2031 OXIDOREDUCTASE
Field of the invention

The present invention relates to a method of screening for
an anti-fungal agent, to fungal 2031 oxidoreductase (2031
OR) enzymes and to diagnosis and therapy of fungal
infections.

Background of the invention 

Oxidoreductases are a major class of enzymes (EC 1) that
catalyse oxidation-reduction (redox) reactions. Redox
reactions involve the transfer of reducing equivalents, in
the form of electrons or hydrogen atoms, between
molecules, i.e., from an electron donor (or reductant) to
an electron acceptor (or oxidant). There are many
different types of oxidoreductase important for many
cellular processes from respiration to protein folding.

The NADH:flavin oxidoreductase/NADH oxidase family of
enzymes (InterPro reference IPR001155) contains
approximately 263 members, mostly of bacterial or yeast
origin but with some plant and nematode members. Members
of this family use flavin mononucleotide (FMN) or flavin
adenine dinucleotide (FAD) as a tightly bound prosthetic
group. The flavin prosthetic group can exist in an
oxidised (FMN or FAD) or a reduced form (FMNH2 or FADH2).
These oxidoreductases use the reduced form of nicotinamide
adenine dinucleotide (NADH) or nicotinamide adenine
dinucleotide phosphate (NADPH) as the reductant. A variety
of substrates can act as oxidants in the redox reaction.

Old Yellow Enzyme (OYE) is the oldest known member of this
family of oxidoreductases (reviewed in Williams and Bruce,
2002, Microbiology 148, 1607-1614). OYE1 (EC 1.6.99.1) was
isolated from brewer's bottom yeast by Warburg & Christian
(1932, Naturwissenschaften 20, 688) and was the first
enzyme for which a cofactor was shown to be required
(Theorell, 1935, Biochem. Z. 275, 344-346). This yellow
cofactor was found to be riboflavin 5'-phosphate (also
known as flavin mononucleotide, FMN). There are 2 OYEs
known in Saccharomyces cerevisiae (OYE2 & OYE3) and 2 in
Schizosaccharomyces pombe. A great deal is known about the
biochemical mechanism and structure of the enzyme;
however, the precise physiological role of the enzyme
remains to be elucidated.

OYE has NADPH dehydrogenase activity (see reaction 1
below). The reduced enzyme catalyses the reduction of
α/β-unsaturated carbonyl compounds including cyclohexenone
(see reaction 2), duroquinone, menadione and
N-ethylmaleimide.

(1)

Enz-FMN + 2NADPH → Enz-FMNH2 + 2NADP+

(2)

Enz-FMNH2 + 2-cyclohexenone → Enz-FMN + cyclohexanone

It has been speculated that OYE may be involved in sterol
metabolism (Stott et al., 1993, J. Biol. Chem. 268, 6097-
6106) or may be part of the antioxidant defence machinery
involved in detoxification of, for example, lipid
peroxidation breakdown products (Kohli & Massey, 1998, J.
Biol. Chem. 273, 32763-32770). Neither OYE2 nor OYE3 is
essential for S. cerevisiae
(http://genome-www4.stanford.edu/cgi-bin/SGD/locus.pl?locus=S0001222;
http://db.yeastgenome.org/cgi-bin/SGD/locus.pl?locus=YPL171C).

Bacterial members of the NADH:flavin oxidoreductase family
include Escherichia coli N-ethylmaleimide reductase,
Pseudomonas putida M10 morphinone reductase, Enterobacter
cloacae PB2 pentaerythritol tetranitrate reductase and
Azoarcus evansii 2-aminobenzoyl-CoA
monooxygenase/reductase (Schuhle et al., 2001, J.
Bacteriol. 183, 5268-5278).

Summary of the invention

The inventors have found a gene for an oxidoreductase of
the NADH:flavin oxidoreductase type to be essential for
the viability of fungal cells. This finding allows the
identification of anti-fungal agents based on their
ability to target the oxidoreductase.

The invention provides a new group of oxidoreductases
which are herein referred to as 2031 oxidoreductases (2031
ORs) which can be used to screen for anti-fungal agents.
In particular, 2031 oxidoreductases from Aspergillus
fumigatus, Aspergillus nidulans, Candida albicans,
Colletotrichium trifolii, Fusarium graminearum (anamorph
of Gibberella zeae), Fusarium sporotrichoides, Magnaporthe
grisea, Neurospora crassa, Schizosaccharomyces pombe and
Ustilago maydis (see Table I) are provided. 2031 OR
defines a novel set of oxidoreductases, related to but
distinct from OYE and its close relatives, which are
essential for the viability of fungal cells.



Accordingly the invention provides the following:

- a method of identifying an anti-fungal agent which
targets an essential protein or gene of a fungus
comprising contacting a candidate substance with

(i) a NADH:flavin oxidoreductase protein which
comprises the sequence shown by SEQ ID NO: 3,

(ii) a NADH:flavin oxidoreductase protein which is a
homologue of (i) and which comprises the sequence shown by
SEQ ID NO: 8, 12, 14, 19, 24, 42, 44, 83 or 85,

(iii) a protein which has 50% identity with (i) or
(ii),

(iv) a protein comprising a fragment of (i), (ii) or
(iii) which fragment has a length of at least 50 amino
acids,

(v) a polynucleotide that comprises sequence which
encodes (i), (ii), (iii) or (iv),

(vi) a polynucleotide comprising sequence which has at
least 70% identity with the coding sequence of (v),

and determining whether the candidate substance binds or
modulates (i), (ii), (iii), (iv), (v) or (vi), wherein
binding or modulation of (i), (ii), (iii), (iv), (v) or
(vi) indicates that the candidate substance is an
anti-fungal agent,

- use of (i), (ii), (iii), (iv), (v) or (vi) as defined
above to identify or obtain an anti-fungal agent,

- use of an anti-fungal agent identified by the method of
the invention in the manufacture of a medicament for
prevention or treatment of fungal infection,

- a method of detecting the presence of a fungus in a
sample comprising detecting the presence in the said
sample of a protein or polynucleotide of the invention,

- an isolated protein or polynucleotide of the invention,

- an organism which is transgenic for a polynucleotide of
the invention,

- an organism which has been genetically engineered to
render a polynucleotide or protein of the invention
non-functional or inhibited,

- an antibody which is specific for a protein of the
invention,

- a method for preventing or treating a fungal infection
comprising administering an anti-fungal agent identified
by the screening method of the invention, and

- a fungus which has been killed, or whose growth has
been impaired, by inhibition of the expression or activity
of a protein or polynucleotide of the invention.

Detailed description of the invention

As mentioned above the invention relates to use of
particular protein and polynucleotide sequences (termed
"proteins of the invention" and "polynucleotides of the
invention" herein) which are of, or derived from, fungal
oxidoreductase proteins and polynucleotides (including
homologues and/or fragments of the fungal oxidoreductase
proteins and polynucleotides) to identify anti-fungal
agents.

As used herein, the term "oxidoreductase" ("OR") may be
defined as an enzyme which is capable of catalysing an
oxidation or reduction reaction. The protein of the
invention may have an oxidation or reduction activity,
such as any such activity mentioned herein. The ORs of the
invention generally fall within classification EC 1 of the
Enzyme Commission.

An essential fungal gene may be defined as one which, when
disrupted genetically (for example when not expressed) in
a fungus, prevents survival or significantly retards
growth of the cell on minimal or defined medium, or in
guinea pigs, mice, rabbits or rats infected with the
fungus. In one embodiment the protein of the invention is
able to complement such an effect of the genetic
disruption. Thus the protein may cause survival
(viability) of a fungal cell which does not express its
native 2031 oxidoreductase.

A protein or polynucleotide of the invention (or a fungal
"2031 OR" gene, nucleic acid or protein) may be defined by
similarity in sequence to another member of the family.
As mentioned above this similarity may be based on
percentage identity (for example to the sequences shown in
the sequence listing).

A protein or polynucleotide of the invention may comprise
one or more of the motifs defined by regions 1 - 11 of
Figures 1 and 2 (marked at the top of the Figures) of any
of the sequences shown. Thus a protein of the invention
may comprise one or more of motifs 1 - 11 as shown for SEQ
ID NO: 3 and a polynucleotide of the invention may comprise
one or more of motifs 1 - 11 as shown for SEQ ID NO: 1.

Typically the motif is present in substantially the same
location as the equivalent location shown in Figure 1 or
2. The equivalent location can be deduced, for example,
using any suitable algorithm mentioned herein. In one
embodiment the protein or polynucleotide also comprises
sequence flanking the motif as shown in Figures 1 or 2
such as sequences of length at least 10, 20 or 30 amino
acids/nucleotides flanking the N terminal side and/or C
terminal side, or 5' and/or 3' side, of the motif; or
sequence which has percentage identity with the flanking
sequence.

The protein of the invention typically comprises at least
2, 5, 8 or 11 of the motifs shown in Figures 1 and 2.
The protein preferably comprises at least motif no. 6
and/or motif no. 9.

The protein or polynucleotide of the invention may align
with other 2031 OR polynucleotides or proteins (as shown
in SEQ ID Nos. 1-44 and 82-85) showing a greater identity
to these than to Old Yellow Enzyme family polynucleotides
or proteins.

The protein or polynucleotide of the invention typically
clusters with other 2031 OR polynucleotides or proteins
(as shown in SEQ ID Nos. 1-44 and 82-85) rather than Old
Yellow Enzyme family polynucleotides or proteins after
phylogenetic analysis, for example with a bootstrap value
of greater than 60%.

In one embodiment the protein of the invention has a
sequence which matches PFAM profile "oxidored_FMN", or
INTERPRO profile IPR001155 (for example with an E-value of
e-50 or less) and is closer to a 2031 OR shown in any one
of SEQ ID Nos. 1-44 and 82-85 than to Old Yellow Enzyme
family proteins.
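The profile-match and "closer to a 2031 OR than to OYE" criteria can be combined into a simple decision rule. The following Python sketch is hypothetical: it assumes that percentage identities to reference 2031 OR and OYE sequences, and a profile E-value, have already been computed by external tools, and the function and parameter names are illustrative only.

```python
def looks_like_2031_or(ident_to_2031, ident_to_oye, profile_evalue,
                       evalue_cutoff=1e-50):
    """Apply the two criteria from the text to precomputed scores.

    ident_to_2031 / ident_to_oye: percentage identities of the query
    to each reference 2031 OR and each OYE family sequence.
    profile_evalue: E-value of the match to the flavin oxidoreductase
    profile (the text gives e-50 or less as an example threshold).
    """
    matches_profile = profile_evalue <= evalue_cutoff
    closer_to_2031 = max(ident_to_2031) > max(ident_to_oye)
    return matches_profile and closer_to_2031


# Hypothetical scores, for illustration only.
print(looks_like_2031_or([62.0, 55.5], [38.0, 41.2], 1e-80))  # True
print(looks_like_2031_or([35.0, 33.1], [58.0, 60.4], 1e-80))  # False
print(looks_like_2031_or([62.0, 55.5], [38.0, 41.2], 1e-10))  # False
```

Comparing the best identity in each reference group is only one possible reading of "closer"; a phylogenetic clustering test, as described above, would be a stricter alternative.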

The protein or polynucleotide of the invention may be in
isolated form (such as non-cellular form), for example
when used in the method of the invention. Preferably, the
isolated polynucleotide comprises a 2031 OR gene.
Preferably, the isolated protein comprises a 2031 OR. The
polynucleotide may comprise native, synthetic or
recombinant polynucleotide, and the protein may comprise
native, synthetic or recombinant protein. The
polynucleotide or protein may comprise combinations of
native, synthetic or recombinant polynucleotide or
protein, respectively. The polynucleotides and proteins of
the invention may have a sequence which is the same as, or
different from, naturally occurring 2031 OR
polynucleotides and proteins.

It is to be understood that the term "isolated from" may be
read as "of" herein. Therefore references to polynucleotides
and proteins being "isolated from" a particular organism
include polynucleotides and proteins which were prepared by
means other than obtaining them from the organism, such as
synthetically or recombinantly.

Preferably, the polynucleotide or protein is isolated
from a fungus, more preferably a filamentous fungus, even
more preferably an Ascomycete.

Preferably, the polynucleotide or protein is isolated
from an organism selected from Aspergillus; Blumeria;
Candida; Colletotrichium; Cryptococcus; Encephalitozoon;
Fusarium; Leptosphaeria; Magnaporthe; Mycosphaerella;
Neurospora; Phytophthora; Plasmopara; Pneumocystis;
Pyricularia; Pythium; Puccinia; Rhizoctonia;
Schizosaccharomyces; Trichophyton; and Ustilago.

Preferably, the polynucleotide or protein is isolated
from an organism independently selected from a group of
genera consisting of Aspergillus, Candida,
Colletotrichium, Fusarium, Magnaporthe, Mycosphaerella,
Neurospora, Schizosaccharomyces and Ustilago.

Preferably, the polynucleotide or protein is isolated
from an organism selected from the species Aspergillus
flavus; Aspergillus fumigatus; Aspergillus nidulans;
Aspergillus niger; Aspergillus parasiticus; Aspergillus
terreus; Blumeria graminis; Candida albicans; Candida
krusei; Candida glabrata; Candida parapsilosis; Candida
tropicalis; Colletotrichium trifolii; Cryptococcus
neoformans; Encephalitozoon cuniculi; Fusarium
graminearum; Fusarium solani; Fusarium sporotrichoides;
Leptosphaeria nodorum; Magnaporthe grisea; Mycosphaerella
graminicola; Neurospora crassa; Phytophthora capsici;
Phytophthora infestans; Plasmopara viticola; Pneumocystis
jiroveci; Puccinia coronata; Puccinia graminis;
Pyricularia oryzae; Pythium ultimum; Rhizoctonia solani;
Schizosaccharomyces pombe; Trichophyton interdigitale;
Trichophyton rubrum; and Ustilago maydis.

Preferably, the polynucleotide or protein is isolated
from an organism selected from Aspergillus fumigatus,
Aspergillus nidulans, Candida albicans, Colletotrichium
trifolii, Fusarium graminearum, Fusarium sporotrichoides,
Magnaporthe grisea, Mycosphaerella graminicola, Neurospora
crassa, Schizosaccharomyces pombe and Ustilago maydis.

The polynucleotide, and preferably the protein, may be
isolated from A. fumigatus AF293.

Table I. 2031 OR sequences claimed and their relationship
to sequences given in the sequence listing.

Organism and gene                             | gDNA/EST (1)                               | Coding sequence (cDNA/mRNA) | Protein
                                              |                                            | w/o UTRs (2)                |
----------------------------------------------+--------------------------------------------+-----------------------------+--------------
A. fumigatus Oxidoreductase 2031              | SEQ ID No. 1: 299-469, 520-1618            | SEQ ID No. 2: 115-1384      | SEQ ID No. 3
A. fumigatus Oxidoreductase 4929              | SEQ ID No. 4: 1-180, 267-1352              | SEQ ID No. 5: 1-1266        | SEQ ID No. 6
A. fumigatus Oxidoreductase 1495              | SEQ ID No. 7: 1-1329                       | SEQ ID No. 7: 1-1329        | SEQ ID No. 8
A. nidulans 1_112                             | SEQ ID No. 9: 1-1269                       | SEQ ID No. 9: 1-1269        | SEQ ID No. 10
C. albicans 2431                              | SEQ ID No. 11: 1-1299                      | SEQ ID No. 11: 1-1299       | SEQ ID No. 12
C. albicans 2464                              | SEQ ID No. 13: 1-1110                      | SEQ ID No. 13: 1-1110       | SEQ ID No. 14
N. crassa NCU07452.1                          | SEQ ID No. 15: 1-1305                      | SEQ ID No. 15: 1-1305       | SEQ ID No. 16
N. crassa Oxidoreductase NCU08900             | SEQ ID No. 17: 1-924, 1015-1362, 1435-1476 | SEQ ID No. 18: 1-1314       | SEQ ID No. 19
M. grisea MG04569.3 (pred gene)               | SEQ ID No. 20: 1-726, 810-1412             | SEQ ID No. 21: 1-1329       | SEQ ID No. 22
S. pombe T39956                               | SEQ ID No. 23: 1-1188                      | SEQ ID No. 23: 1-1188       | SEQ ID No. 24
C. trifolii (EST assembly)                    | SEQ ID No. 25: 130-777                     | SEQ ID No. 26: 1-645        | SEQ ID No. 27
F. sporotrichoides FsCon[0063] (EST assembly) | SEQ ID No. 28: 103-803                     | SEQ ID No. 29: 1-701        | SEQ ID No. 30
F. sporotrichoides FsCon[0237] (EST assembly) | SEQ ID No. 31: 76-631 (rev comp)           | SEQ ID No. 32: 1-556        | SEQ ID No. 33
F. sporotrichoides FsCon[0458] (EST assembly) | SEQ ID No. 34: 174-657                     | SEQ ID No. 34: 174-657      | SEQ ID No. 35
F. graminearum 15771741 (EST)                 | SEQ ID No. 36: 1-744                       | SEQ ID No. 37: 1-742 (4)    | SEQ ID No. 38
F. graminearum FG00074.1                      | SEQ ID No. 82: 1-1326                      | SEQ ID No. 82: 1-1326       | SEQ ID No. 83
M. graminicola mg[0281] (EST)                 | SEQ ID No. 39: 1-647                       | SEQ ID No. 39: 1-647        | SEQ ID No. 40
M. graminicola mga0328f (EST)                 | SEQ ID No. 41: 1-560                       | SEQ ID No. 41: 1-560        | SEQ ID No. 42
M. grisea MG03823.3                           | SEQ ID No. 43: 1-1254                      | SEQ ID No. 43: 1-1254       | SEQ ID No. 44
Ustilago maydis Contig 1.2                    | SEQ ID No. 84: 1-1350                      | SEQ ID No. 84: 1-1350       | SEQ ID No. 85

(1) Numbers after SEQ ID Nos. correspond to bases of genomic
DNA encoding the protein.
(2) RNA sequences are given in the sequence listing with
Thymidine (T), although it is understood that in vivo
Uridine (U) would be present.
(3) A one-base deletion at position 690 of the EST (SEQ ID
No. 22) is required to give the best predicted
cDNA/protein.
(4) Two single base deletions are required to optimise
translation.
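The coordinate convention of Table I can be illustrated with a short Python sketch. It assumes the base ranges are 1-based and inclusive, which is consistent with the table: the A. fumigatus 2031 ranges 299-469 and 520-1618 of SEQ ID No. 1 together cover the same number of bases as 115-1384 of SEQ ID No. 2. The function names and the toy sequence are hypothetical and are not taken from the sequence listing.

```python
def region_length(ranges):
    """Total number of bases covered by 1-based, inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in ranges)


def splice(gdna, ranges):
    """Join the listed 1-based, inclusive ranges of a genomic sequence."""
    return "".join(gdna[start - 1:end] for start, end in ranges)


# Ranges quoted in Table I for A. fumigatus oxidoreductase 2031.
gdna_ranges = [(299, 469), (520, 1618)]   # SEQ ID No. 1
cdna_range = [(115, 1384)]                # SEQ ID No. 2

print(region_length(gdna_ranges))  # 1270
print(region_length(cdna_range))   # 1270

# Toy sequence (hypothetical, not from the sequence listing).
toy = "AAACCCGGGTTT"
print(splice(toy, [(1, 3), (10, 12)]))  # AAATTT
```

The same check holds for the N. crassa NCU08900 row, where the three ranges of SEQ ID No. 17 sum to the 1314 bases of SEQ ID No. 18.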

Bioinformatics analysis was carried out to identify
functionally important regions within the fungal 2031 ORs.
The 2031 ORs are related to but distinct from the "Old
Yellow Enzyme" (OYE) group of yeast enzymes, which also
includes ergosterol-binding protein of Candida albicans.
Comparison of the 2031 ORs with crystal structures of OYE
family proteins identified highly conserved residues
responsible for the catalytic function of these enzymes.
However, the comparisons also identified seven clusters of
residues conserved in 2031 enzymes but not OYE enzymes
which flanked the substrate binding site and were
therefore implicated in determining substrate specificity
(regions 2, 4, 6, 7, 8, 10, and 11 in Figures 1 and 2, and
Example 4 hereinafter). Four further conserved clusters of
residues were identified which, while not predicted to be
involved in catalysis, were conserved in 2031 but not OYE
and so also distinguish 2031 ORs from OYEs (regions 1, 3,
5, and 9 in Figures 1 and 2, and Example 4 hereinafter).

Variants of the above mentioned polynucleotides and proteins
are also provided, and are discussed below.

In one embodiment, the protein of the invention may
comprise an amino acid sequence substantially as set out
and independently selected from regions 1 - 11 of any of
SEQ ID Nos. 3, 6, 8, 10, 12, 14, 16, 19, 22, 24, 27, 30,
33, 35, 38, 40, 42, 44, 83 or 85 as given in Figure 1, or
variants thereof. At least one region or motif may be
functional.

The polynucleotide of the invention may comprise DNA, such
as genomic DNA. The polynucleotide may comprise a sequence
substantially as set out and independently selected from
regions 1 - 11 of any of SEQ ID Nos. 1, 4, 7, 9, 11, 13,
15, 17, 20, 23, 25, 28, 31, 34, 36, 39, 41, 43, 82 or 84 as
given in Figure 2, or complements, or variants thereof.

Preferably, the polynucleotide encodes a fungal 2031 OR
protein which comprises substantially the amino acid
sequences SEQ ID Nos. 3, 6, 8, 10, 12, 14, 16, 19, 22, 24,
27, 30, 33, 35, 38, 40, 42, 83 or 85 or a variant thereof.

The polynucleotide may comprise RNA, preferably mRNA,
preferably spliced mRNA. Preferably, the polynucleotide
comprises substantially the sequence shown as SEQ ID Nos.
2, 5, 7, 9, 11, 13, 15, 18, 21, 23, 26, 29, 32, 34, 36,
37, 39, 41, 43, 82 or 84 or a complement, or a variant
thereof.

Preferably, the protein comprises substantially the
sequences SEQ ID Nos. 3, 6, 8, 10, 12, 14, 16, 19, 22, 24,
27, 30, 33, 35, 38, 40, 42, 44, 83 or 85 or a variant
thereof.
Preferably, the protein is encoded by the regions of
sequences SEQ ID Nos. 1, 4, 7, 9, 11, 13, 15, 17, 20, 23,
25, 26, 28, 29, 31, 34, 36, 39, 41, 43, 82 or 84 as
described in Figure 1, in the column "gDNA/EST" in Table
I, or a complement, or a variant thereof.

The polynucleotide may comprise substantially a nucleotide
sequence region or motif independently selected from at
least one of regions 1-11 from at least one of the
sequences SEQ ID Nos. 1, 2, 4, 5, 7, 9, 11, 13, 15, 17,
18, 20, 21, 23, 25, 26, 28, 29, 31, 32, 34, 36, 37, 39,
41, 43, 82 or 84, as given in Figure 2, or a complement,
or a variant thereof.

Preferably, the isolated polynucleotide comprises
substantially a nucleotide sequence independently selected
from the regions and sequences given in the column
"gDNA/EST" in Table I.

Preferably, the protein is encoded by a polynucleotide
which polynucleotide comprises substantially a sequence
independently selected from at least one of the
regions and sequences given in the column "gDNA/EST" in
Table I, or a complement, or a variant thereof.

By the term "native amino acid/polynucleotide/protein" is
meant an amino acid, polynucleotide or protein produced
naturally from biological sources either in vivo or in
vitro.

By the term "synthetic amino acid/polynucleotide/protein"
is meant an amino acid, polynucleotide or protein which
has been produced artificially or de novo using a DNA or
protein synthesis machine known in the art.

By the term "recombinant amino acid/polynucleotide/protein"
is meant an amino acid, polynucleotide or
protein which has been produced using recombinant DNA or
protein technology or methodologies which are known to the
skilled technician.

The term "variant", and the terms "substantially the amino
acid/polynucleotide/protein sequence" are used herein to
refer to related sequences. As discussed below such
related sequences are typically homologous to (share
percentage identity with) a given sequence, for example
over the entire length of the sequence or over a portion
of a given length. The related sequence may also be a
fragment of the sequence or of a homologous sequence. A
variant protein may be encoded by a variant
polynucleotide.

By the term "variant", and the terms "substantially the amino
acid/polynucleotide/protein sequence", we mean that the
sequence has at least 30%, preferably 40%, more preferably
50%, and even more preferably 60% sequence identity with the
amino acid/polynucleotide/protein sequences of any one of the
sequences referred to. A sequence which is "substantially the
amino acid/polynucleotide/peptide sequence" may be the same
as the relevant sequence.

Calculation of percentage identities between different
amino acid/polynucleotide/protein sequences may be carried
out as follows. A multiple alignment is first generated by
the ClustalX program (pairwise parameters: gap opening
10.0, gap extension 0.1, protein matrix Gonnet 250, DNA
matrix IUB; multiple parameters: gap opening 10.0, gap
extension 0.2, delay divergent sequences 30%, DNA
transition weight 0.5, negative matrix off, protein matrix
Gonnet series, DNA weight IUB; protein gap parameters:
residue-specific penalties on, hydrophilic penalties on,
hydrophilic residues GPSNDQERK, gap separation distance 4,
end gap separation off). The percentage identity is then
calculated from the multiple alignment as (N/T)*100, where
N is the number of positions at which the two sequences
share an identical residue, and T is the total number of
positions compared. Alternatively, percentage identity can
be calculated as (N/S)*100 where S is the length of the
shorter sequence being compared. The amino
acid/polynucleotide/protein sequences may be synthesised de
novo, or may be native amino acid/polynucleotide/protein
sequence, or a derivative thereof.
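The two formulas (N/T)*100 and (N/S)*100 can be sketched in Python. This is an illustration only. It assumes the two sequences are rows taken from the same multiple alignment (equal length, with "-" for gaps), that T counts the columns in which at least one of the two rows has a residue, and that S is the ungapped length of the shorter sequence. The text does not specify how gapped columns are treated, so these conventions and the function names are assumptions.

```python
GAP = "-"


def identity_counts(row_a, row_b):
    """Return (N, T) for two rows of the same alignment.

    N: columns where both rows carry the same residue.
    T: columns compared, here taken as columns where at least one
       row has a residue (an assumption; the text does not say).
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    n = t = 0
    for a, b in zip(row_a, row_b):
        if a == GAP and b == GAP:
            continue  # column is empty for this pair, skip it
        t += 1
        if a == b:
            n += 1
    return n, t


def percent_identity_total(row_a, row_b):
    """(N/T)*100 over the compared columns."""
    n, t = identity_counts(row_a, row_b)
    return 100.0 * n / t if t else 0.0


def percent_identity_shorter(row_a, row_b):
    """(N/S)*100, with S the ungapped length of the shorter sequence."""
    n, _ = identity_counts(row_a, row_b)
    s = min(len(row_a.replace(GAP, "")), len(row_b.replace(GAP, "")))
    return 100.0 * n / s if s else 0.0


# Toy aligned rows (hypothetical, not from the sequence listing).
a = "MKT-AYLG"
b = "MKSQAY--"
print(percent_identity_total(a, b))              # 50.0
print(round(percent_identity_shorter(a, b), 1))  # 66.7
```

In the toy pair, four columns are identical out of eight compared, and the shorter sequence has six residues, which is why the two formulas give different values for the same alignment.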

An amino acid/polynucleotide/protein sequence with a
greater identity than 65% to any of the sequences referred
to is also envisaged. An amino acid/polynucleotide/protein
sequence with a greater identity than 70% to any of the
sequences referred to is also envisaged. An amino
acid/polynucleotide/protein sequence with a greater
identity than 75% to any of the sequences referred to is
also envisaged. An amino acid/polynucleotide/protein
sequence with a greater identity than 80% to any of the
sequences referred to is also envisaged. Preferably, the
amino acid/polynucleotide/protein sequence has 85%
identity with any of the sequences referred to, more
preferably 90% identity, even more preferably 92%
identity, even more preferably 95% identity, even more
preferably 97% identity, even more preferably 98% identity
and, most preferably, 99% identity with any of the
referred to sequences.

The above mentioned percentage identities may be measured
over the entire length of the original sequence or over a
region of 15, 20, 50 or 100 amino acids/bases of the
original sequence. In a preferred embodiment percentage
identity is measured with reference to SEQ ID No. 3.
Preferably the variant protein has at least 40% identity,
such as at least 60% or at least 80% identity with SEQ ID
No. 3 or a portion of SEQ ID No. 3.

Alternatively, a substantially similar nucleotide sequence
will be encoded by a sequence which hybridizes to the
sequences shown in SEQ ID Nos. 1, 2, 4, 5, 7, 8, 9, 11, 13,
15, 17, 18, 20, 21, 23, 25, 26, 28, 29, 31, 32, 34, 36, 37,
39, 41, 43, 82 or 84 or their complements under stringent
conditions. By stringent conditions, we mean the nucleotide
hybridises to filter-bound DNA or RNA in 6x sodium
chloride/sodium citrate (SSC) at approximately 45°C followed
by at least one wash in 0.2x SSC/0.1% SDS at approximately
5-65°C. Alternatively, a substantially similar protein may
differ by at least 1, but less than 5, 10, 20, 50 or 100
amino acids from the sequences shown in SEQ ID Nos. 3, 6, 8,
10, 12, 14, 16, 19, 22, 24, 27, 30, 33, 35, 38, 40, 42, 44,
83 or 85. Such differences may each be additions, deletions
or substitutions.

Due to the degeneracy of the genetic code, it is clear
that any nucleic acid sequence could be varied or changed
without substantially affecting the sequence of the
protein encoded thereby, to provide a functional variant
thereof. Suitable nucleotide variants are those having a
sequence altered by the substitution of different codons
that encode the same amino acid within the sequence, thus
producing a silent change.
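A silent change of this kind can be illustrated with a minimal Python sketch. The partial codon table lists only a few codons of the standard genetic code, chosen for brevity; the toy sequences and function names are hypothetical and are not taken from the sequence listing.

```python
# Partial standard genetic code: just enough codons for the toy example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "GAT": "D", "GAC": "D",
    "TAA": "*",
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon."""
    if len(dna) % 3:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(original, variant):
    """True if the sequences differ but encode the same protein."""
    return original != variant and translate(original) == translate(variant)


original = "ATGGCTAAAGATTAA"   # M A K D *
variant = "ATGGCCAAGGACTAA"    # synonymous codons at three positions
print(translate(original))                    # MAKD*
print(is_silent_variant(original, variant))   # True
```

The variant differs from the original at three nucleotide positions yet encodes the same protein, so each substitution is a silent change in the sense used above.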

Other suitable variants are those having homologous
nucleotide sequences but comprising all, or portions of,
sequence which are altered by the substitution of
different codons that encode an amino acid with a side
chain of similar biophysical properties to the amino acid
it substitutes, to produce a conservative change. For
example small non-polar, hydrophobic amino acids include
glycine, alanine, leucine, isoleucine, valine, proline,
and methionine. Large non-polar, hydrophobic amino acids
include phenylalanine, tryptophan and tyrosine. The polar
neutral amino acids include serine, threonine, cysteine,
asparagine and glutamine. The positively charged (basic)
amino acids include lysine, arginine and histidine. The
negatively charged (acidic) amino acids include aspartic
acid and glutamic acid. Certain organisms, including
Candida, are known to use non-standard codons compared to
those used in the majority of eukaryotes. Any comparisons
of polynucleotides and proteins from such organisms with
the sequences given here should take these differences
into account.
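The amino acid groupings listed above can be written as a lookup table. The Python sketch below treats a substitution as conservative when both residues fall in the same group. That same-group rule and the names used are assumptions made for illustration, not a definition taken from the text.

```python
# Groups as listed in the text, in one-letter codes.
GROUPS = {
    "small non-polar": set("GALIVPM"),
    "large non-polar": set("FWY"),
    "polar neutral": set("STCNQ"),
    "basic": set("KRH"),
    "acidic": set("DE"),
}

# Reverse lookup: residue -> group name.
GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def is_conservative(original, replacement):
    """True if two different residues belong to the same group."""
    return (
        original != replacement
        and GROUP_OF[original] == GROUP_OF[replacement]
    )


print(is_conservative("L", "I"))  # True  (both small non-polar)
print(is_conservative("K", "E"))  # False (basic vs acidic)
print(len(GROUP_OF))              # 20
```

The five groups together cover all 20 standard amino acids, so every standard residue can be classified by this table.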

In accurate alignment of protein or DNA sequences the
trade-off between optimal matching of sequences and the
introduction of gaps to obtain such a match is important.
In the case of proteins, the means by which matches are
scored is also of significance. The family of PAM matrices
(e.g., Dayhoff, M. et al., 1978, Atlas of protein sequence
and structure, Natl. Biomed. Res. Found.) and BLOSUM
matrices quantitate the nature and likelihood of
conservative substitutions and are used in multiple
alignment algorithms, although other, equally applicable
matrices will be known to those skilled in the art. The
popular multiple alignment program ClustalW, and its
windows version ClustalX (Thompson et al., 1994, Nucleic
Acids Research, 22, 4673-4680; Thompson et al., 1997,
Nucleic Acids Research, 24, 4876-4882) are efficient ways
to generate multiple alignments of proteins and DNA.

Use of the Align program is also preferred
(http://www.gwdg.de/~dhepper/download/; Hepperle, D.,
2001: Multicolor Sequence Alignment Editor. Institute of
Freshwater Ecology and Inland Fisheries, 16775 Stechlin,
Germany), although others, such as jalView or Cinema are
also suitable.

Calculation of percentage identities between proteins
occurs during the generation of multiple alignments by
Clustal. However, these values need to be recalculated if
the alignment has been manually improved, or for the
deliberate comparison of two sequences. Programs that
calculate this value for pairs of protein sequences within
an alignment include PROTDIST within the PHYLIP phylogeny
package (Felsenstein; http://evolution.gs.washington.edu/phylip.html)
using the "Similarity Table" option as the
model for amino acid substitution (P). For DNA/RNA, an
identical option exists within the DNADIST program of
PHYLIP.

Other modifications in protein sequences are also
envisaged and within the scope of the claimed invention,
i.e. those which occur during or after translation, e.g.
by acetylation, amidation, carboxylation, phosphorylation,
proteolytic cleavage or linkage to a ligand.

The term "variant", and the terms "substantially the amino
acid/polynucleotide/protein sequence" also include a fragment
of the relevant polynucleotide or protein sequences,
including a fragment of the homologous sequences (which have
percentage identity to a specified sequence) referred to
above. A polynucleotide fragment will typically comprise at
least 10 bases, such as at least 20, 30, 50, 100, 200, 500 or
1000 bases. A protein fragment will typically comprise at
least 10 amino acids, such as at least 20, 30, 50, 80, 100,
150, 200, 300, 400 or 500 amino acids. The fragments may lack
at least 3 amino acids, such as at least 10, 20 or 30 amino
acids of the amino acids from either end of the protein.

The invention provides a method of screening which may be
used to identify modulators of 2031 OR proteins or
polynucleotides, such as inhibitors of expression or
activity of the proteins or polynucleotides of the
invention. In one embodiment of the method a candidate
substance is contacted with a protein or polynucleotide of
the invention and whether or not the candidate substance
binds or modulates the protein or polynucleotide is
determined.

The modulator may promote (agonise) or inhibit
(antagonise) the activity of the protein. A therapeutic
modulator (against fungal infection) will inhibit the
expression or activity of protein or polynucleotide of the
invention.

The method may be carried out in vitro (inside or outside
a cell) or in vivo. In one embodiment the method is
carried out on a cell, cell culture or cell extract. The
cell may or may not be a cell in which the polynucleotide
or protein is naturally present. The cell may or may not
be a fungal cell, or may or may not be a cell of any of
the fungi mentioned herein. The protein or polynucleotide
may be present in a non-cellular form in the method, thus
the protein may be in the form of a recombinant protein
purified from a cell.

Any suitable binding or activity assay may be used. 

Methods which determine whether a candidate substance is
able to bind the protein or polynucleotide may comprise
providing the protein or polynucleotide to a candidate
substance and determining whether binding occurs, for
example by measuring the amount of the candidate substance
which binds the protein or polynucleotide. The binding
may be determined by measuring a characteristic of the
protein or polynucleotide that changes upon binding, such
as spectroscopic changes.

The assay format may be a 'band shift' system. This
involves determining whether a test candidate advances or
retards the protein or polynucleotide on gel
electrophoresis relative to the absence of the compound.

The method may be a competitive binding method. This
determines whether the candidate is able to inhibit the
binding of the protein or polynucleotide to an agent which
is known to bind to the protein or polynucleotide, such as
an antibody specific for the protein.

Whether or not a candidate substance modulates the
activity of the protein may be determined by providing the
candidate substance to the protein under conditions that
permit activity of the protein, and determining whether
the candidate substance is able to modulate the activity
of the protein.

The activity which is measured may be any of the
activities of the protein of the invention mentioned
herein, such as oxidoreductase activity. In one
embodiment the screening method comprises carrying out a
redox reaction in the presence and absence of the
candidate substance to determine whether the candidate
substance inhibits the oxidoreductase activity of the
protein of the invention, wherein the redox reaction is
carried out by contacting said protein with NADH or NADPH,
and an electron acceptor, under conditions in which, in the
absence of the candidate substance, the protein catalyses
reduction of the electron acceptor.

In a preferred embodiment the inhibition of the redox
reaction is measured by detecting the amount of NADH or
NADPH oxidation, for example by measuring the generation
of the oxidised forms of NADH and NADPH spectroscopically.
This can be done by measurement at 340 nm (see Example 7).
Alternatively, a suitable colourimetric oxidoreductase
substrate may be used to measure inhibition, such as
methylene blue, phenazine methosulphate or
2,6-dichlorophenolindophenol.
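The 340 nm readout described above can be turned into a percent inhibition figure. The following is a minimal sketch only: the extinction coefficient is the commonly quoted literature value for NADH/NADPH rather than a figure from this specification, and the function names and absorbance readings are hypothetical.

```python
# Molar extinction coefficient of NADH/NADPH at 340 nm (M^-1 cm^-1).
# Assumption: commonly quoted literature value, not taken from this document.
EPSILON_340 = 6220.0


def oxidation_rate(times_s, a340, path_cm=1.0):
    """Return the NAD(P)H oxidation rate in micromolar per minute.

    A least-squares line is fitted to A340 versus time, and the
    (negative) slope is converted to a concentration change using
    the Beer-Lambert law.
    """
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_a = sum(a340) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, a340))
    slope_per_s = sxy / sxx  # absorbance units per second
    return -slope_per_s * 60.0 / (EPSILON_340 * path_cm) * 1e6


def percent_inhibition(rate_without, rate_with):
    """Percent reduction in rate caused by the candidate substance."""
    return 100.0 * (1.0 - rate_with / rate_without)


# Hypothetical readings taken every 30 s in a 1 cm cuvette.
times = [0, 30, 60, 90, 120]
without_candidate = [0.900, 0.840, 0.780, 0.720, 0.660]
with_candidate = [0.900, 0.885, 0.870, 0.855, 0.840]

rate_control = oxidation_rate(times, without_candidate)
rate_treated = oxidation_rate(times, with_candidate)
inhibition = percent_inhibition(rate_control, rate_treated)
```

Fitting a line to several time points, rather than using two readings, is one way to reduce the influence of a single noisy measurement.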

Suitable candidate substances which can be tested in the
above methods include antibody products (for example,
monoclonal and polyclonal antibodies, single chain
antibodies, chimeric antibodies and CDR-grafted
antibodies). Furthermore, combinatorial libraries, defined
chemical identities, peptide and peptide mimetics,
oligonucleotides and natural product libraries, such as
display libraries (e.g. phage display libraries), may also
be tested. The candidate substances may be chemical
compounds. Batches of the candidate substances may be
used in an initial screen of, for example, ten substances
per reaction, and the substances from batches which show
inhibition tested individually.
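The batch-then-individual strategy can be sketched as follows. This is an illustration under stated assumptions: `inhibits_pool` and `inhibits_single` are hypothetical stand-ins for the wet-lab assay, the compound names are invented, and a pool is assumed to show inhibition whenever it contains at least one inhibitor.

```python
def batches(substances, size=10):
    """Split the candidate list into consecutive batches of `size`."""
    return [substances[i:i + size] for i in range(0, len(substances), size)]


def pooled_screen(substances, inhibits_pool, inhibits_single, size=10):
    """Screen batches first, then retest members of inhibitory batches.

    Returns the individual hits and the total number of reactions run.
    """
    hits = []
    reactions = 0
    for pool in batches(substances, size):
        reactions += 1  # one reaction for the whole batch
        if inhibits_pool(pool):
            for substance in pool:
                reactions += 1  # one reaction per individual retest
                if inhibits_single(substance):
                    hits.append(substance)
    return hits, reactions


# Hypothetical library of 35 compounds with two true inhibitors.
library = [f"cmpd{i:03d}" for i in range(35)]
true_inhibitors = {"cmpd007", "cmpd031"}

hits, reactions = pooled_screen(
    library,
    inhibits_pool=lambda pool: any(s in true_inhibitors for s in pool),
    inhibits_single=lambda s: s in true_inhibitors,
)
```

In this invented case the two inhibitors are found with 19 reactions instead of the 35 needed to test every compound individually.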

According to a further aspect of the present invention,
there is provided a polynucleotide or protein of the
invention for use as a medicament or in diagnosis.

The polynucleotide or protein may be modified prior to
use, preferably to produce a derivative or variant
thereof. The polynucleotide or protein may be derivatised.
The protein may be modified by epitope tagging, addition
of fusion partners or purification tags such as
glutathione S-transferase, multiple histidines or maltose
binding protein, addition of green fluorescent protein,
covalent attachment of molecules including biotin or
fluorescent tags, incorporation of selenomethionine,
inclusion or attachment of radioisotopes or
fluorescent/non-fluorescent lanthanide chelates. The
polynucleotide may be modified by methylation or
attachment of digoxigenin (DIG) or by addition of sequence
encoding the above tags, proteins or epitopes.

Preferably, the medicament is adapted to retard or prevent
a fungal infection. The fungal infection may be in human,
animal or plant. The polynucleotide or protein may be used
for the development of a drug. The polynucleotide or
protein may be used in, or for the generation of, a
molecular model of said polynucleotide or said protein.

According to a further aspect of the present invention,
there is provided use of a polynucleotide or protein of
the invention for the preparation of a medicament for the
treatment of a fungal infection.

The polynucleotide or protein may be modified prior to
use, preferably to produce a derivative or variant
thereof. The polynucleotide or protein may be derivatised.
The polynucleotide or protein may not be modified or
derivatised.

Preferably, the medicament is adapted to retard or prevent
a fungal infection. The treatment may comprise retarding
or preventing fungal infection. Preferably, the drug
and/or medicament comprises an inhibitor, preferably a
2031 OR inhibitor. Preferably, the drug or medicament is
adapted to inhibit expression and/or activity of the
polynucleotide or a fragment thereof, and/or the function
of the protein or a fragment thereof.

Preferably, the fungal infection comprises an infection by
a fungus, more preferably an Ascomycete, and even more
preferably, an organism selected from the genera
Aspergillus; Blumeria; Candida; Colletotrichium;
Cryptococcus; Encephalitozoon; Fusarium; Leptosphaeria;
Magnaporthe; Mycosphaerella; Neurospora; Phytophthora;
Plasmopara; Pneumocystis; Pyricularia; Pythium; Puccinia;
Rhizoctonia; Schizosaccharomyces; Trichophyton; and
Ustilago.

Preferably, the fungal infection comprises an infection by
an organism selected from the genera Aspergillus, Candida,
Colletotrichium, Fusarium, Magnaporthe, Mycosphaerella and
Ustilago.

Preferably, the fungal infection comprises an infection by
an organism selected from the species Aspergillus flavus;
Aspergillus fumigatus; Aspergillus nidulans; Aspergillus
niger; Aspergillus parasiticus; Aspergillus terreus;
Blumeria graminis; Candida albicans; Candida cruzei;
Candida glabrata; Candida parapsilosis; Candida
tropicalis; Colletotrichium trifolii; Cryptococcus
neoformans; Encephalitozoon cuniculi; Fusarium
graminearum; Fusarium solani; Fusarium sporotrichoides;
Leptosphaeria nodorum; Magnaporthe grisea; Mycosphaerella
graminicola; Phytophthora capsici; Phytophthora infestans;
Plasmopara viticola; Pneumocystis jiroveci; Puccinia
coronata; Puccinia graminis; Pyricularia oryzae; Pythium
ultimum; Rhizoctonia solani; Trichophyton interdigitale;
Trichophyton rubrum; and Ustilago maydis.

Preferably, the fungal infection comprises an infection by
an organism selected from the species Aspergillus
fumigatus, Aspergillus nidulans, Candida albicans,
Colletotrichium trifolii, Fusarium graminearum, Fusarium
sporotrichoides, Magnaporthe grisea, Mycosphaerella
graminicola and Ustilago maydis.

According to another aspect of the present invention,
there is provided a method of detecting the presence of a
fungal infection in an individual, said method
comprising:

(i) obtaining a sample from an organism; and

(ii) detecting in the said sample the presence of a
polynucleotide or protein of the invention.

The individual may be a person (human) or animal (such as
a mammal or bird) or a plant. The fungal infection may
arise from infection with an organism selected from the
genera Aspergillus; Blumeria; Candida; Colletotrichium;
Cryptococcus; Encephalitozoon; Fusarium; Leptosphaeria;
Magnaporthe; Mycosphaerella; Phytophthora; Plasmopara;
Pneumocystis; Pyricularia; Pythium; Puccinia; Rhizoctonia;
Trichophyton; and Ustilago.

The fungal infection may arise from infection with an
organism selected from the species Aspergillus flavus;
Aspergillus fumigatus; Aspergillus nidulans; Aspergillus
niger; Aspergillus parasiticus; Aspergillus terreus;
Blumeria graminis; Candida albicans; Candida cruzei;
Candida glabrata; Candida parapsilosis; Candida
tropicalis; Colletotrichium trifolii; Cryptococcus
neoformans; Encephalitozoon cuniculi; Fusarium
graminearum; Fusarium solani; Fusarium sporotrichoides;
Leptosphaeria nodorum; Magnaporthe grisea; Mycosphaerella
graminicola; Phytophthora capsici; Phytophthora infestans;
Plasmopara viticola; Pneumocystis jiroveci; Puccinia
coronata; Puccinia graminis; Pyricularia oryzae; Pythium
ultimum; Rhizoctonia solani; Trichophyton interdigitale;
Trichophyton rubrum; and Ustilago maydis.

Preferably the sample comprises a biological sample
which, preferably, comprises nucleic acid and/or protein.
In one embodiment of the method the nucleic acid or
protein is purified (at least partially) from the sample
before the detection is performed.

Where the organism is Aspergillus fumigatus, Aspergillus
nidulans or Aspergillus niger, the sample may comprise
sputum, bronchoalveolar lavage, urine, respiratory
specimens, endotracheal aspirates, sterile specimens
obtained by an invasive procedure such as vitreous tap,
tympanocentesis, brain biopsy or aspiration, nasal or
sinus specimens, blood, tissue or autopsy.

Where the organism is Magnaporthe grisea the sample may
comprise rice leaf or rice stem.

Preferably, said detecting of the presence in the said
sample of a polynucleotide as defined by the first or
third aspect comprises use of at least one oligonucleotide
pair adapted to be used for amplification of DNA,
preferably genomic, more preferably, fungal genomic DNA.
The amplification may be PCR amplification.

Preferably, the PCR amplification employs at least one
primer pair comprising a polynucleotide selected from the
group consisting of:

Aspergillus fumigatus: SEQ ID Nos 67 and 68 for SEQ ID No.
1; SEQ ID Nos 69 and 70 for SEQ ID No. 4; and SEQ ID Nos
71 and 72 for SEQ ID No. 7.

Candida albicans: SEQ ID Nos 73 and 74 for SEQ ID No. 11.

Magnaporthe grisea: SEQ ID Nos 75 and 76 for SEQ ID No.
20.

Preferably, said detecting comprises subjecting the
amplified DNA to size analysis, preferably,
electrophoresis and, preferably, comparing the results to
a positive control and, preferably, a negative control.
Said detecting may also comprise sequencing of the
amplified DNA to demonstrate the correct sequence.

Preferably, said detecting of the presence in the said
sample of a protein comprises use of a monoclonal or
polyclonal antibody directed to part or all of the protein
of the invention.

According to a further aspect of the present invention,
there is provided a recombinant DNA molecule or vector
comprising a polynucleotide of the invention.

The recombinant DNA molecule or vector may comprise an
expression cassette. Preferably, the recombinant DNA
molecule or vector comprises an expression vector.
Preferably, the polynucleotide sequence is operatively
linked to an expression control sequence. A suitable
control sequence may comprise a promoter, an enhancer etc.

According to another aspect of the present invention,
there is provided a cell containing a polynucleotide,
recombinant DNA molecule or vector of the invention.

The cell may be transformed or transfected with the
polynucleotide, recombinant DNA molecule or vector by
suitable means. Preferably, the cell produces a
recombinant protein of the invention.

The invention also provides an organism which is
transgenic for the polynucleotide of the invention (whose
cells may be the same as the cells of the invention
mentioned herein). Such an organism is typically a
fungus, such as any genera or species of fungus mentioned
herein. The organism may be a microorganism, such as a
bacterium, virus or yeast. The organism may be a plant or
animal (including birds and mammals), such as any of the
animals mentioned herein.

The organism may be produced by introduction of the
polynucleotide of the invention into a cell of the organism,
and in the case of a multicellular organism allowing the cell
to grow into a whole organism.

According to a further aspect of the present invention,
there is provided a cell in which a native polynucleotide
or protein of the invention is non-functional
and/or inhibited. The cell may be of, or present in, a
multicellular organism.

The cell may be a mutant cell. The cell is typically a fungal
cell, such as of any genera or species of fungus mentioned
herein. A preferred means of generating the cell is to modify
the polynucleotide of the invention, such that the
polynucleotide is non-functional. This modification may be to
cause a mutation, which disrupts the expression or function
of a gene product. Such mutations may be to the nucleic acid
sequences that act as 5' or 3' regulatory sequences for the
polynucleotide, or may be a mutation introduced into the
coding sequence of the polynucleotide. Functional deletion of
the polynucleotide may be, for example, by mutation of the
polynucleotide in the form of nucleotide substitution,
addition or, preferably, nucleotide deletion.

The polynucleotide may be made non-functional and/or
inhibited by:

(i) shifting the reading frame of the coding sequence
of the polynucleotide;

(ii) adding, substituting or deleting amino acids in
the protein encoded by the polynucleotide;

(iii) partially or entirely deleting the DNA coding for
the polynucleotide and/or the upstream and downstream
regulatory sequences associated with the polynucleotide; or

(iv) inserting DNA into the coding or non-coding
regions.

A preferred means of introducing a mutation into a
polynucleotide is to utilize molecular biology techniques
specifically to target the polynucleotide which is to be
mutated. Mutations may be induced using a DNA molecule. A
most preferred means of introducing a mutation is to use a
DNA molecule that has been especially prepared such that
homologous recombination occurs between the target
polynucleotide and the DNA molecule. When this is the
case, the DNA molecule, which may be double stranded, may
contain base sequences similar or identical to the target
polynucleotide to allow the DNA molecule to hybridize to
(and subsequently recombine with) the target.

It is also possible to provide a cell in which the
polynucleotide is non-functional and/or inhibited without
introducing a mutation into the gene or its regulatory
regions. This may be done by using specific inhibitors.
Examples of such inhibitors include agents that prevent
transcription of the polynucleotide, prevent translation or
expression, or disrupt post-translational modification.
Alternatively, the inhibitor may be an agent that increases
degradation of the gene product (e.g. a specific proteolytic
enzyme). Equally, the inhibitor may be an agent which
prevents the polynucleotide product from functioning, such as
neutralizing antibodies (for instance an anti-2031 OR
antibody). The inhibitor may also be an antisense
oligonucleotide, or any synthetic chemical capable of
inhibiting expression of the gene or the stability and/or
function of the protein. The inhibitor may also be a protein
which interacts with the 2031 OR to prevent its function. The
inhibitor may also be an RNA molecule which causes inhibition
by RNA interference. In one embodiment the antisense
polynucleotide or RNA molecule which causes RNA interference
is an example of a polynucleotide of the invention.

According to a further aspect, there is provided an
antibody exhibiting immunospecificity for a protein of the
invention. The antibody may be used as a diagnostic
reagent.

The antibody may be monoclonal or polyclonal, and may be
raised in mouse, rat, rabbit, chicken, turkey, horse, goat
or donkey. The antibody may be raised against one or all
of the proteins together, or may be raised against
proteolytic or recombinant fragments.

For the purposes of this invention, the term "antibody",
unless specified to the contrary, includes fragments which
bind a protein of the invention. Such fragments include
Fv, F(ab') and F(ab')2 fragments, as well as single chain
antibodies. Furthermore, the antibodies and fragments
thereof may be chimeric antibodies, CDR-grafted antibodies
or humanised antibodies.

Administration

The formulation of any of the therapeutic substances (e.g.
proteins, polynucleotides or modulators) mentioned herein
will depend upon factors such as the nature of the
substance and the condition to be treated. Any such
substance may be administered in a variety of dosage
forms. It may be administered orally (e.g. as tablets,
troches, lozenges, aqueous or oily suspensions,
dispersible powders or granules), parenterally,
subcutaneously, intravenously, intramuscularly,
intrasternally, transdermally or by infusion techniques.
The substance may also be administered as suppositories.
A physician will be able to determine the required route
of administration for each particular patient.

Typically the substance is formulated for use with a
pharmaceutically acceptable carrier or diluent. The
pharmaceutical carrier or diluent may be, for example, an
isotonic solution. For example, solid oral forms may
contain, together with the active compound, diluents, e.g.
lactose, dextrose, saccharose, cellulose, corn starch or
potato starch; lubricants, e.g. silica, talc, stearic
acid, magnesium or calcium stearate, and/or polyethylene
glycols; binding agents, e.g. starches, arabic gums,
gelatin, methylcellulose, carboxymethylcellulose or
polyvinyl pyrrolidone; disaggregating agents, e.g. starch,
alginic acid, alginates or sodium starch glycolate;
effervescing mixtures; dyestuffs; sweeteners; wetting
agents, such as lecithin, polysorbates, laurylsulphates;
and, in general, non-toxic and pharmacologically inactive
substances used in pharmaceutical formulations. Such
pharmaceutical preparations may be manufactured in known
manner, for example, by means of mixing, granulating,
tabletting, sugar-coating, or film coating processes.
Liquid dispersions for oral administration may be syrups,
emulsions and suspensions. The syrups may contain as
carriers, for example, saccharose or saccharose with
glycerine and/or mannitol and/or sorbitol. Suspensions
and emulsions may contain as carrier, for example a
natural gum, agar, sodium alginate, pectin,
methylcellulose, carboxymethylcellulose, or polyvinyl
alcohol. The suspensions or solutions for intramuscular
injections may contain, together with the active compound,
a pharmaceutically acceptable carrier, e.g. sterile water,
olive oil, ethyl oleate, glycols, e.g. propylene glycol,
and if desired, a suitable amount of lidocaine
hydrochloride.

Solutions for intravenous injection or infusion may contain
as carrier, for example, sterile water or preferably they may
be in the form of sterile, aqueous, isotonic saline
solutions.

A therapeutically effective non-toxic amount of substance
is administered. The dose may be determined according to
various parameters, especially according to the substance
used; the age, weight and condition of the patient to be
treated; the route of administration; and the required
regimen. Again, a physician will be able to determine the
required route of administration and dosage for any
particular patient. A typical daily dose is from about
0.1 to 50 mg per kg, preferably from about 0.1 mg/kg to
10 mg/kg of body weight, according to the activity of the
specific inhibitor, the age, weight and conditions of the
subject to be treated, the type and severity of the
disease and the frequency and route of administration.
Preferably, daily dosage levels are from 5 mg to 2 g.
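The arithmetic behind the per-kilogram ranges can be sketched as below. The function name and the 70 kg body weight are hypothetical, chosen only to show how a mg/kg range converts to a total daily amount; the sketch says nothing about which dose is appropriate in a given case.

```python
def daily_dose_range_mg(body_weight_kg, low_mg_per_kg=0.1, high_mg_per_kg=50.0):
    """Convert a per-kg daily dose range into total milligrams per day.

    The defaults are the typical range quoted in the text
    (about 0.1 to 50 mg per kg of body weight).
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg


# Hypothetical 70 kg subject.
typical_low, typical_high = daily_dose_range_mg(70.0)
preferred_low, preferred_high = daily_dose_range_mg(70.0, 0.1, 10.0)
```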

Agricultural use 

Modulators identified by the method of the invention may
be administered to plants in order to prevent or treat
fungal infections. The modulators are normally applied in
the form of compositions together with one or more
agriculturally acceptable carriers or diluents and can be
applied to the crop area or plant to be treated,
simultaneously or in succession with further compounds.

The modulators of the invention can be applied together
with carriers, surfactants or application-promoting
adjuvants customarily employed in the art of formulation.
Suitable carriers and diluents correspond to substances
ordinarily employed in formulation technology, e.g.
natural or regenerated mineral substances, solvents,
dispersants, wetting agents, tackifiers, binders or
fertilizers.

A preferred method of applying the modulators of the
present invention or an agrochemical composition which
contains them is leaf application. The number of
applications and the rate of application depend on the
intensity of infection by the fungus. However, the active
ingredients can also penetrate the plant through the roots
via the soil (systemic action) by impregnating the locus
of the plant with a liquid composition, or by applying the
compounds in solid form to the soil, e.g. in granular form
(soil application). The active ingredients may also be
applied to seeds (coating) by impregnating the seeds
either with a liquid formulation containing active
ingredients, or coating them with a solid formulation. In
special cases, further types of application are also
possible, for example, selective treatment of the plant
stems or buds.

The active ingredients are used in unmodified form or,
preferably, together with the adjuvants conventionally
employed in the art of formulation, and are therefore
formulated in known manner to emulsifiable concentrates,
coatable pastes, directly sprayable or dilutable
solutions, dilute emulsions, wettable powders, soluble
powders, dusts, granulates, and also encapsulations, for
example, in polymer substances. Like the nature of the
compositions, the methods of application, such as
spraying, atomizing, dusting, scattering or pouring, are
chosen in accordance with the intended objectives and the
prevailing circumstances. Advantageous rates of
application are normally from 50 g to 5 kg of active
ingredient (a.i.) per hectare ("ha", approximately 2.471
acres), preferably from 100 g to 2 kg a.i./ha, most
preferably from 200 g to 500 g a.i./ha.
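The per-hectare rates can be converted to per-acre rates, or to a total amount for a given field, using the 2.471 acres-per-hectare approximation quoted above. This is a minimal sketch; the function names and the example field size are hypothetical.

```python
ACRES_PER_HECTARE = 2.471  # approximation quoted in the text


def g_per_ha_to_g_per_acre(rate_g_per_ha):
    """Convert an application rate from g a.i./ha to g a.i./acre."""
    return rate_g_per_ha / ACRES_PER_HECTARE


def amount_for_area_g(rate_g_per_ha, area_ha):
    """Total grams of active ingredient needed for a field of `area_ha`."""
    return rate_g_per_ha * area_ha


# Lower and upper ends of the broad range (50 g to 5 kg a.i./ha).
low_per_acre = g_per_ha_to_g_per_acre(50.0)
high_per_acre = g_per_ha_to_g_per_acre(5000.0)

# Hypothetical 2.5 ha field treated at 200 g a.i./ha.
total_g = amount_for_area_g(200.0, 2.5)
```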

The formulations, compositions or preparations containing
the active ingredients and, where appropriate, a solid or
liquid adjuvant, are prepared in known manner, for example
by homogeneously mixing and/or grinding active ingredients
with extenders, for example solvents, solid carriers and,
where appropriate, surface-active compounds (surfactants).

Suitable solvents include aromatic hydrocarbons,
preferably the fractions having 8 to 12 carbon atoms, for
example, xylene mixtures or substituted naphthalenes,
phthalates such as dibutyl phthalate or dioctyl phthalate,
aliphatic hydrocarbons such as cyclohexane or paraffins,
alcohols and glycols and their ethers and esters, such as
ethanol, ethylene glycol, monomethyl or monoethyl ether,
ketones such as cyclohexanone, strongly polar solvents
such as N-methyl-2-pyrrolidone, dimethyl sulfoxide or
dimethyl formamide, as well as epoxidized vegetable oils
such as epoxidized coconut oil or soybean oil; or water.

The solid carriers, used e.g. for dusts and dispersible
powders, are normally natural mineral fillers such as
calcite, talcum, kaolin, montmorillonite or attapulgite.
In order to improve the physical properties it is also
possible to add highly dispersed silicic acid or highly
dispersed absorbent polymers. Suitable granulated
adsorptive carriers are porous types, for example pumice,
broken brick, sepiolite or bentonite; and suitable
nonsorbent carriers are materials such as calcite or sand.
In addition, a great number of pregranulated materials of
inorganic or organic nature can be used, e.g. especially
dolomite or pulverized plant residues.

Depending on the nature of the active ingredient to be
used in the formulation, suitable surface-active compounds
are nonionic, cationic and/or anionic surfactants having
good emulsifying, dispersing and wetting properties. The
term "surfactants" will also be understood as comprising
mixtures of surfactants.

Suitable anionic surfactants can be both water-soluble
soaps and water-soluble synthetic surface-active
compounds. Suitable soaps are the alkali metal salts,
alkaline earth metal salts or unsubstituted or substituted
ammonium salts of higher fatty acids (chains of 10 to 22
carbon atoms), for example the sodium or potassium salts
of oleic or stearic acid, or of natural fatty acid
mixtures which can be obtained for example from coconut
oil or tallow oil. The fatty acid methyltaurin salts may
also be used.

More frequently, however, so-called synthetic surfactants
are used, especially fatty sulfonates, fatty sulfates,
sulfonated benzimidazole derivatives or
alkylarylsulfonates. The fatty sulfonates or sulfates are
usually in the form of alkali metal salts, alkaline earth
metal salts or unsubstituted or substituted ammonium
salts and have an 8 to 22 carbon alkyl radical which also
includes the alkyl moiety of alkyl radicals, for example,
the sodium or calcium salt of lignosulfonic acid, of
dodecylsulfate or of a mixture of fatty alcohol sulfates
obtained from natural fatty acids. These compounds also
comprise the salts of sulfuric acid esters and sulfonic
acids of fatty alcohol/ethylene oxide adducts. The
sulfonated benzimidazole derivatives preferably contain 2
sulfonic acid groups and one fatty acid radical containing
8 to 22 carbon atoms. Examples of alkylarylsulfonates are
the sodium, calcium or triethanolamine salts of
dodecylbenzenesulfonic acid, dibutylnaphthalenesulfonic
acid, or of a naphthalenesulfonic acid/formaldehyde
condensation product. Also suitable are corresponding
phosphates, e.g. salts of the phosphoric acid ester of an
adduct of p-nonylphenol with 4 to 14 moles of ethylene
oxide.

Non-ionic surfactants are preferably polyglycol ether
derivatives of aliphatic or cycloaliphatic alcohols, or
saturated or unsaturated fatty acids and alkylphenols,
said derivatives containing 3 to 30 glycol ether groups
and 8 to 20 carbon atoms in the (aliphatic) hydrocarbon
moiety and 6 to 18 carbon atoms in the alkyl moiety of the
alkylphenols.

Further suitable non-ionic surfactants are the
water-soluble adducts of polyethylene oxide with polypropylene
glycol, ethylenediamine propylene glycol and
alkylpolypropylene glycol containing 1 to 10 carbon atoms
in the alkyl chain, which adducts contain 20 to 250
ethylene glycol ether groups and 10 to 100 propylene
glycol ether groups. These compounds usually contain 1 to
5 ethylene glycol units per propylene glycol unit.

Representative examples of non-ionic surfactants are
nonylphenolpolyethoxyethanols, castor oil polyglycol
ethers, polypropylene/polyethylene oxide adducts,
tributylphenoxypolyethoxyethanol, polyethylene glycol and
octylphenoxyethoxyethanol. Fatty acid esters of
polyoxyethylene sorbitan and polyoxyethylene sorbitan
trioleate are also suitable non-ionic surfactants.

Cationic surfactants are preferably quaternary ammonium
salts which have, as N-substituent, at least one C8-C22
alkyl radical and, as further substituents, lower
unsubstituted or halogenated alkyl, benzyl or lower
hydroxyalkyl radicals. The salts are preferably in the
form of halides, methylsulfates or ethylsulfates, e.g.
stearyltrimethylammonium chloride or
benzyldi(2-chloroethyl)ethylammonium bromide.

The surfactants customarily employed in the art of
formulation are described, for example, in "McCutcheon's
Detergents and Emulsifiers Annual", MC Publishing Corp.,
Ringwood, New Jersey, 1979, and Sisely and Wood,
"Encyclopaedia of Surface Active Agents", Chemical
Publishing Co., Inc., New York, 1980.

The agrochemical compositions usually contain from about
0.1 to about 99%, preferably about 0.1 to about 95%, and
most preferably from about 3 to about 90% of the active
ingredient, from about 1 to about 99.9%, preferably from
about 1 to about 99%, and most preferably from about 5 to
about 95% of a solid or liquid adjuvant, and from about 0 to
about 25%, preferably about 0.1 to about 25%, and most
preferably from about 0.1 to about 20% of a surfactant.

Whereas commercial products are preferably formulated as
concentrates, the end user will normally employ dilute
formulations.
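A candidate formulation can be checked against the broadest percentage ranges given above. This is a sketch under stated assumptions: the three components are treated as separate fractions that should sum to 100%, and the dictionary keys, function name and example compositions are hypothetical.

```python
# Broadest ranges quoted in the text, in percent of the composition.
BROAD_RANGES = {
    "active": (0.1, 99.0),
    "adjuvant": (1.0, 99.9),
    "surfactant": (0.0, 25.0),
}


def check_composition(percentages, ranges=BROAD_RANGES, tol=1e-6):
    """Return a list of problems found; an empty list means none."""
    problems = []
    for name, (low, high) in ranges.items():
        value = percentages.get(name, 0.0)
        if not low <= value <= high:
            problems.append(f"{name} = {value}% is outside {low}-{high}%")
    total = sum(percentages.values())
    if abs(total - 100.0) > tol:
        problems.append(f"components sum to {total}%, not 100%")
    return problems


# Hypothetical compositions.
acceptable = {"active": 25.0, "adjuvant": 70.0, "surfactant": 5.0}
too_dilute = {"active": 0.05, "adjuvant": 99.95, "surfactant": 0.0}

acceptable_problems = check_composition(acceptable)
too_dilute_problems = check_composition(too_dilute)
```

The narrower "preferably" and "most preferably" ranges could be checked the same way by passing a different `ranges` dictionary.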

All of the features described herein may be combined with
any of the above aspects, in any combination.

Embodiments of the invention will now be described by way
of example, with reference to the accompanying drawings in
which:

Figure 1 illustrates a multiple sequence alignment of
amino acid sequences corresponding to fungal and bacterial
2031 and OYE family oxidoreductases;

Figure 2 illustrates a multiple sequence alignment of
nucleic acid sequences corresponding to fungal 2031 and
family oxidoreductases;

Figure 3A illustrates the expression of recombinant 2031
OR; B shows purified recombinant 2031 OR;

Figure 4 shows a phylogenetic tree showing relationships
between A. fumigatus 2031 OR and similar proteins. This
demonstrates a 2031 OR clade, which can be distinguished
from the OYE proteins;

Figure 5 illustrates reduction of a range of substrates by
recombinant 2031 OR.

EXAMPLES 

Example 1. Identification of an essential gene in
Aspergillus fumigatus

An essential region of the A. fumigatus genome was
identified using the mycobank technology as described in
patent WO00177295A1 with the following modifications:

20 . 

Re-haploidisation (section 1.6):

P24 lines 11-18: Conidia (A. fumigatus) were collected
from a stable diploid transformant colony and
approximately 3x10^ spores were used to inoculate 1 ml of
SAB broth containing 1 mg/ml FPA. This culture was
incubated with shaking (200 rpm) at 37°C for 20 hours.
100 µl of the culture was spread onto complete media
containing 0.2 mg/ml FPA and incubated at 37°C for 3
days or until rapidly growing sectors emerged. Conidia
were collected from each sector and plated onto nitrate,
nitrite and hypoxanthine media and the nitrogen
utilisation profiles of the resulting conidia assessed.
Colonies with the nitrogen utilisation profiles of the
parental strains indicated breakdown of the diploid to a
haploid. 44 haploid sectors were isolated from
transformant 2031. None of the haploids isolated were
hygromycin resistant, indicating the insertion of the hph
gene into a portion of the genome required for function.
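
The strength of this inference can be illustrated with a rough calculation. The sketch below rests on an assumption that is not stated in the source: if the insertion lay in a dispensable region, each haploid sector would inherit the hph-marked chromosome independently with probability 0.5. The function name and the 0.5 figure are illustrative only.

```python
# Illustrative assumption (not from the source): if the hph insertion lay in a
# dispensable region, each haploid sector would carry the marked chromosome
# with probability 0.5, independently of the other sectors.
def prob_no_resistant(n_haploids: int, p_marked: float = 0.5) -> float:
    """Probability that none of n_haploids sectors is hygromycin resistant."""
    return (1.0 - p_marked) ** n_haploids


# 44 haploid sectors were isolated from transformant 2031.
p = prob_no_resistant(44)
print(f"P(0 resistant of 44 | dispensable locus) = {p:.2e}")
```

Under that assumption the chance of recovering no resistant haploid among 44 is vanishingly small, which is why the absence of resistant haploids is read as evidence that the disrupted region is required.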

Transformation (section 1.7):

P25 line 9: Plasmid pAN7-1 linearised with HindIII was
used as the transforming vector. pAN7-1 carries the hph
gene which confers hygromycin resistance.

P25 lines 17-20: 1 ml of cold YED was added to the
cuvette and incubated at 37°C for 1 h. Aliquots were
spread on selective agar (complete media with 250 µg/ml
hygromycin). Colonies growing on selective media were
deemed putative transformants.

The point of insertion was identified using the plasmid
rescue method outlined on page 31 lines 5-17. The
insertion site was confirmed by PCR: using the
sequence obtained from plasmid rescue data, a primer was
designed within the sequence of pAN7-1 and a complementary
primer was designed within the predicted sequence near the
point of insertion. Genomic DNA isolated from the diploid
2031 was used as a template.

The resulting DNA sequence (experiment 2031, with 175
bases of upstream pAN7-1 sequence removed) corresponds to
the gDNA sequence immediately downstream of the insertion
site and is given as SEQ ID No. 45.

Example 2. Characterisation of the essential gene
2.1 Genome analysis

The TIGR A. fumigatus database (www.TIGR.org) was searched
(blastn) with the sequence SEQ ID No. 45, identified in
Example 1 above, and a match to contig 4798 (E-value
4.6e-148) was identified. The appropriate region of the contig
sequence was downloaded from www.tigr.org and gene
predictions carried out using Genscan
(genes.mit.edu/GENSCAN.html; Settings: organism =
vertebrate; Suboptimal exon cutoff = 1.00).

The ab initio prediction of genes from genomes is known to
be an inaccurate process (Burset, M. and Guigo, 1996,
Genomics, 34, 353-367) and this is particularly so when
the programs used have not been specifically trained for
the genome under examination (as is the case here). It is
therefore necessary to carefully examine the predictions,
to compare any predicted genes with any homologous
proteins, and to exploit the operative's knowledge of
fungal gene structure, and thus to arrive at an informed
prediction. The predicted genes were therefore compared
with similar sequences using blastp
(http://blast.genome.ad.jp/), the multiple alignment program
ClustalX (Thompson et al., 1997, Nucleic Acids Research,
24:4876-4882), and the alignment editor/viewer Align
(http://www.gwdg.de/~dhepper/download/; Hepperle, D.,
2001: Multicolor Sequence Alignment Editor. Institute of
Freshwater Ecology and Inland Fisheries, 16775 Stechlin,
Germany). Gene structures were visualised and modified
using Artemis (http://www.sanger.ac.uk/Software/Artemis/;
Rutherford et al., 2000, Bioinformatics 16, 944-945).

The gene adjacent to the insertion site corresponded to
bases 299-469 (exon 1) and bases 520-1618 (exon 2) of the
genomic sequence given as SEQ ID No. 1. The protein
sequence for the gene is given as SEQ ID No. 3. The
insertion site was 735 bases upstream of the 5' ATG start
of the gene.

Searches of the protein databases at
http://blast.genome.ad.jp/ showed that protein SEQ ID No.
3 is a member of the NADH-dependent flavin oxidoreductase
family. This protein is henceforth referred to as 2031
oxidoreductase (2031 OR; having come from mycobank
experiment 2031). Other 2031 OR-like proteins were also
identified (see Example 4.1). The NADH-dependent flavin
oxidoreductase family also includes Old Yellow Enzyme
(OYE), from S. cerevisiae and other fungi, although 2031
ORs can be distinguished from OYEs.

Referring to Figure 1, there is shown a multiple
alignment of the 2031 OR amino acid sequence from A.
fumigatus along with related ORs from other fungi and
bacteria (see also Example 4). Regions 1-11 refer to amino
acids conserved between ORs.

Fungal 2031 ORs are given by: SEQ ID Nos. 3, 6 and 8, A.
fumigatus; SEQ ID No. 10, A. nidulans; SEQ ID Nos. 12 and
14, C. albicans; SEQ ID Nos. 16 and 19, N. crassa; SEQ ID
Nos. 22 and 44, M. grisea; SEQ ID No. 24 (NP_595868), S.
pombe; SEQ ID No. 27, C. trifolii; SEQ ID Nos. 30, 33 and
35, F. sporotrichioides; SEQ ID Nos. 38 and 83, F.
graminearum; SEQ ID Nos. 40 and 42, M. graminicola; SEQ ID
No. 85, U. maydis.

Bacterial ORs resembling 2031 are: T44612 (Pseudomonas
putida); NP_625402 (Streptomyces coelicolor); NP_295913
(Deinococcus radiodurans); AF320254 (Azoarcus evansii).

Fungal ORs similar to the Old Yellow Enzyme family
(originally identified in S. cerevisiae): Af4875 and
Af4961, A. fumigatus; Ca2460 and A36990, C. albicans;
Nc4452, N. crassa; OYE1, OYE2 and OYE3, S. cerevisiae.

Details of the sequence searches that identified the ORs
other than SEQ ID No. 3, and methods for the construction
of multiple alignments are given in Example 4 hereinafter.

Referring to Figure 2, there is shown a multiple alignment
of the nucleotide sequence of 2031 OR from A. fumigatus
along with related 2031 ORs from other fungi and bacteria
(see also Example 4). Regions 1-11 refer to amino acids
conserved between 2031 ORs at the amino acid level.
Fungal 2031 ORs are given by: SEQ ID Nos. 1, 2,
4, 5 and 7, A. fumigatus; SEQ ID No. 9, A. nidulans; SEQ
ID Nos. 11 and 13, C. albicans; SEQ ID Nos. 15, 17 and 18,
N. crassa; SEQ ID Nos. 20, 21 and 43, M. grisea; SEQ ID
No. 23 (NP_595868), S. pombe; SEQ ID Nos. 25 and 26, C.
trifolii; SEQ ID Nos. 28, 29, 31, 32 and 34, F.
sporotrichioides; SEQ ID Nos. 36, 37 and 82, F.
graminearum; SEQ ID Nos. 39 and 41, M. graminicola; SEQ ID
No. 84, U. maydis.

Details of the sequence searches that identified the ORs,
and methods for the construction of multiple alignments
are given in Example 4 hereinafter.

2.2 Genomic Sequencing of Genes

Following the above bioinformatic analyses, the genomic
sequence of 2031 OR was experimentally determined.

2.2.1 Bacterial and Fungal Strains

For bacterial cloning, E. coli strains Top10 (Invitrogen)
and Select96 (Promega) were used in accordance with
manufacturers' instructions.

A. fumigatus clinical isolate AF293 (ref. No. NCPF7367;
available to the public from the NCPF repository (Bristol,
U.K.), the CBS repository (Belgium) or from Dr. David
Denning's clinical isolate culture collection, Hope
Hospital, Salford, U.K.) is the preferred strain according
to the present invention. AF293 was isolated in 1993 from
the lung biopsy of a patient with invasive aspergillosis
and aplastic anaemia. It was donated by Shrewsbury PHLS.

2.2.2 Purification of A. fumigatus genomic DNA
To obtain mycelial material for genomic DNA isolation,
approximately 10^ A. fumigatus conidia were inoculated in
50 ml of Vogel's minimal medium and incubated with shaking
at 200 rpm until late exponential phase (18-24 h) at 37°C.
Mycelium was dried down onto Whatman 54 paper using a
Büchner funnel and a side-arm flask attached to a vacuum
pump and washed with PBS/Tween. At this point, the
mycelium could be freeze-dried for extraction at a later
date.

The mycelium (fresh or freeze-dried) was ground to a
powder using liquid nitrogen in a -20°C cooled mortar. The
ground biomass was transferred to 50 ml tubes on ice up to
the 10 ml mark. An equal volume of extraction buffer (0.7
M NaCl; 0.1 M Na2SO3; 0.1 M Tris-HCl pH 7.5; 0.05 M EDTA;
1% (w/v) SDS; pre-warmed to 65°C) was then added to each
tube, mixed thoroughly with a pipette tip and incubated at
65°C for 20 minutes in a water bath. A volume of
chloroform/isoamyl alcohol (24:1) equivalent to the volume
of the original biomass was then added to each tube, tubes
were mixed thoroughly and incubated on ice for 30 min.
Tubes were then centrifuged at 3,500 x g for 30 min and
the aqueous phase carefully transferred to fresh 50 ml
tubes without disturbing the interface.

An equal volume of chloroform/isoamyl alcohol (24:1) was
added, the tubes vortexed and incubated on ice for 15
minutes. Tubes were then spun at 3,500 x g for 15 minutes.
After this spin, if large amounts of precipitate were
still present, the supernatant was removed and the
chloroform/isoamyl alcohol step repeated. The supernatant
was removed and placed in clean sterile Oak Ridge tubes.
An equal volume of isopropanol was added and mixed gently.
Tubes were incubated at room temperature for at least 15
minutes. Tubes were then centrifuged at 3,030 x g for 10
minutes at 4°C to pellet the DNA. The supernatant was
removed and the pellet allowed to air dry for 10-25
minutes. The pellet was suspended in 2 ml sterile water. 1
ml of 7.5 M ammonium acetate was added, mixed and
incubated on ice for 1 hour. Tubes were centrifuged at
12,000 x g for 30 min, the supernatants transferred to a
fresh tube and 0.54 volumes of isopropanol were added,
mixed and incubated at room temperature for at least 15
minutes. Tubes were then centrifuged at 5,930 x g for 10
min, the supernatant was removed and the pellet washed in
1 ml of 70% ethanol. Tubes were centrifuged at 5,930 x g
for 10 min and all the ethanol was removed. The pellet was
air dried for 20-30 minutes at room temperature and
suspended in 0.5-1.0 ml of TE (10 mM Tris-HCl pH 7.5; 1 mM
EDTA). Finally, the DNA was treated with RNase A (5 µl of
1 mg/ml stock).


2.2.3 PCR Reactions

Primers were designed to the upstream and downstream
regions of the A. fumigatus AF293 2031 OR; cloning primer
pair SEQ ID Nos. 46 (Ox9_for) and 47 (Ox10_rev). The
following reagents and conditions were used:

PCR Master Mix

10x high fidelity PCR buffer                  5 µl
dNTP (Clontech: 10 mM)                        1 µl
nH2O                                         39 µl
Pfu Ultra Polymerase (2.5 U/µl)               1 µl
Forward primer (Ox9_for: 10 pmol/µl stock)    1 µl
Reverse primer (Ox10_rev: 10 pmol/µl stock)   1 µl
gDNA (1:30 dilution of stock)                 2 µl

PCR Cycle

1) 95°C   2 min
2) 95°C   30 sec
3) 54°C   30 sec
4) 72°C   2 min
5) 72°C   10 min
6) 8°C    Hold

40 cycles of steps 2-4 were carried out and the PCR
products were run on a gel. The product band (1.9 kb) was
excised from the gel and purified using Qiagen's QIAquick
Gel Extraction Kit (Qiagen Ltd, Boundary Court, Gatwick
Road, Crawley, West Sussex, RH10 9AX, UK) according to the
manufacturer's instructions and eluted into 30 µl of
sterile water (BDH molecular biology grade/filter
sterile).
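
As a quick consistency check on the master mix in section 2.2.3, the listed volumes can be summed and the final concentrations derived from the dilution relation C1 x V1 = C2 x V2. The sketch below uses only the volumes and stock concentrations given in the text; the dictionary and function names are illustrative, and treating 10 pmol/µl as 10 µM is a unit identity rather than extra data.

```python
# Volumes in microlitres, transcribed from the PCR master mix in section 2.2.3.
MASTER_MIX_UL = {
    "10x high fidelity PCR buffer": 5.0,
    "dNTP (10 mM)": 1.0,
    "nH2O": 39.0,
    "Pfu Ultra polymerase": 1.0,
    "forward primer Ox9_for (10 pmol/ul)": 1.0,
    "reverse primer Ox10_rev (10 pmol/ul)": 1.0,
    "gDNA (1:30 dilution)": 2.0,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Dilution formula C1*V1 = C2*V2 solved for C2 (same units as stock)."""
    return stock * volume_ul / total_ul


total_ul = sum(MASTER_MIX_UL.values())
buffer_x = final_concentration(10.0, 5.0, total_ul)    # buffer strength in "x"
dntp_mm = final_concentration(10.0, 1.0, total_ul)     # mM
primer_um = final_concentration(10.0, 1.0, total_ul)   # 10 pmol/ul equals 10 uM

print(f"total volume: {total_ul} ul")
print(f"buffer: {buffer_x}x, dNTP: {dntp_mm} mM, each primer: {primer_um} uM")
```

The components add up to a 50 µl reaction, giving 1x buffer, 0.2 mM dNTP and 0.2 µM of each primer.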

2.2.4 Genomic DNA Cloning and Sequencing

Since the gDNA was amplified using Pfu Ultra polymerase,
which produces blunt ends, it was necessary to add 'A'
overhangs before ligating into pGEM-T Easy. 12.5 µl of
purified PCR product was incubated with 12.5 µl 2x PCR
Reddy Mix (ABgene) at 70°C for 30 minutes. The
sample was then purified using the Qiagen QIAquick gel
extraction kit and eluted in 30 µl of molecular biology
grade water.

The PCR product was then ligated into pGEM-T Easy (Promega)
using the following ligation mixture:

2x Buffer        5 µl
pGEM-T Easy      1 µl
PCR product      3 µl
T4 DNA Ligase    1 µl

The reaction was incubated overnight at 4°C.

2 µl of the ligation mix were then added to Select96
cells (Promega) and incubated for 20 min on ice. Cells
were then heat shocked at 42°C for 45 secs and placed
back on ice. 250 µl of room temp. SOC medium was then
added and the cells incubated for 1 hour at 37°C, with
shaking at 220 rpm. 50 and 200 µl amounts were then plated
on to LB agar plates containing ampicillin (100 µg/ml), 50
µl X-gal (4%) and 10 µl IPTG (100 mM) and incubated
overnight at 37°C.

Individual white colonies were picked from each
transformation, inoculated into LB with ampicillin (100
µg/ml) and incubated overnight at 37°C, with shaking at
220 rpm. Plasmid DNA was extracted using the Qiagen miniprep
kit according to the manufacturer's instructions. 1 µl of
plasmid DNA was digested with EcoRI for 1 hour at 37°C.
Fragment sizes are calculated to be 3 kb and 1.6 kb for gDNA
and 3 kb and 1.2 kb for cDNA. Clones showing the correct
restriction digest pattern were sequenced at MWG Biotech
UK Ltd, Waterside House, Peartree Bridge, Milton Keynes,
MK6 3BY. The experimentally determined sequence of 2031 OR
was identical in the coding regions to that identified by
bioinformatic analyses (Example 2).

Example 3. cDNA sequencing and RACE for 2031 OR
The internal sequence of the 2031 OR message was
experimentally determined by cloning and sequencing cDNA,
and the 5' and 3' ends of the gene were determined by RACE
(Rapid Amplification of cDNA Ends).

3.1 cDNA cloning and sequencing

3.1.1 Preparation of A. fumigatus RNA and cDNA

Fungal cultures were prepared as described in Example
2.2.2. Cultures were harvested by filtration, then washed
twice with DEPC-treated water and transferred to a 50 ml
Falcon tube. Samples were frozen in liquid nitrogen and
stored at -80°C until required.

To prepare RNA, fungal samples were ground to a fine
powder under liquid nitrogen. RNA was then extracted using
the Qiagen RNeasy Plant Mini Kit following the protocol
for isolation of total RNA from filamentous fungi in the
RNeasy Mini Handbook (06/2001, pages 75-78,
http://www.qiagen.com/literature/handbooks/rna/rnamini/1016272HBRNY_062001WW.pdf). The
following modifications were used: at step 3, RLC was used
as the lysis buffer of choice; at step 1, the RNeasy
column was incubated for 5 min at room temperature after
addition of RW1; the optional step 9a was carried out; at
step 10, 30 µl RNase-free water was added, the samples
incubated for 10 min at room temperature, and then
centrifuged; at step 11, the elution step was repeated to
give a total volume of 60 µl RNA.

DNA contamination was removed from the RNA by the addition
of DNase, using 2 µl DNase per µg RNA, in the presence of
10X DNase buffer and incubating at 37°C for 2 h. DNase-
treated RNA was cleaned up using the RNeasy Plant Mini Kit
following the RNeasy Mini Protocol for RNA Cleanup (RNeasy
Mini Handbook 06/2001, pages 79-81).

To synthesise cDNA from the above RNA the following
reaction mixture was prepared: 100 ng-1 µg of DNA-free RNA,
3 µl oligo(dT) (100 ng/µl), and DEPC-treated water to a
total volume of 42 µl. Samples were incubated in a heat
block at 65°C for 5 min after which they were allowed to
cool slowly to room temperature. Then 2 µl Ultrapure dNTPs,
1 µl reverse transcriptase (Stratascript) and 5 µl 10X
reverse transcriptase reaction buffer (Stratascript) were
added. Samples were incubated at 42°C for 1 h, denatured at
90°C for 5 min and then cooled on ice.

3.1.2 Production of cDNA constructs
PCR was carried out using the cDNA above to generate cDNA
fragments using the primer pair SEQ ID No. 48 (Ox1_for)
and SEQ ID No. 49 (Ox3_rev). PCR reactions were carried
out using the following reagents and conditions:

PCR Master Mix

10x high fidelity PCR buffer
dNTP (Clontech: 10 mM)
MgSO4 (50 mM)
nH2O
Platinum Taq Polymerase (5 U/µl)
Forward primer (Ox1_for: 10 pmol/µl stock)
Reverse primer (Ox3_rev: 10 pmol/µl stock)
cDNA

PCR Cycle

1) 94°C   5 min
2) 94°C   30 sec
3) 53°C   30 sec
4) 68°C   90 sec
5)        10 min
6) 8°C    Pause

Cycles 2-4 were run 40 times in total. The amplicon was
1269 bp. The PCR products were purified using Qiagen's
QIAquick PCR Purification Kit (Qiagen Ltd, Boundary Court,
Gatwick Road, Crawley, West Sussex, RH10 9AX, UK)
according to the manufacturer's instructions. The purified
PCR products were examined on agarose gels.

PCR products were ligated into pGEM-T Easy, used to
transform Select96 cells, and sequenced as described in
2.2.4 above. The cDNA sequence obtained is given as bases
115-1385 of SEQ ID No. 2.

3.2 RACE

To determine the 5' and 3' ends of the genes, RACE (Rapid
Amplification of cDNA Ends) was carried out, using the
GeneRacer™ Kit (Invitrogen; cat. No. L1502-01),
essentially as per manufacturer's instructions.

3.2.1 Preparation of RNA

A. fumigatus biomass was prepared as described in 2.2.2.
RNA was prepared using the FastRNA kit (QBIOgene)
following the manufacturer's instructions (Revision 6030-
999-1J05) with the following amendments: at step 1, 40 mg
of biomass was used per extraction; at step 2, samples
were processed for 20 seconds at speed 5, incubated on ice
for 3 minutes, and processed again for 20 seconds at speed
5; at step 3, samples were centrifuged for 5 minutes; at
step 5, 500 µl DIPS were added, mixed, and incubated at
room temperature for 2 minutes. Samples were mixed again
and incubated for a further 2 minutes; at step 6, two
washes in 250 µl SEWS were carried out; at step 7, the
pellet was dissolved in 50 µl SAFE buffer.



3.2.2 RACE

1 µg total RNA prepared as described above was de-
phosphorylated in a 10 µl reaction using 10 units of calf
intestinal phosphatase (CIP), 1 µl 10X CIP buffer and 40 U
RNaseOut™ (made up to 10 µl in DEPC water) at 50°C for 1
hour. Samples were then made up to 100 µl with DEPC water
and the RNA extracted with 100 µl (25:24:1)
phenol:chloroform:isoamyl alcohol. RNA was then
precipitated by the addition of 2 µl mussel glycogen
(10 mg/ml), 10 µl 3 M sodium acetate, pH 5.2 and 220 µl 95%
ethanol and the sample frozen on dry ice for 10 minutes.
RNA was pelleted by centrifugation at 14,500 rpm for 20
minutes at 4°C, washed with 70% ethanol, air dried and re-
suspended in 8 µl DEPC water.

De-phosphorylated RNA (7 µl) was de-capped in a 10 µl
reaction with 0.5 U tobacco acid pyrophosphatase (TAP), 1
µl 10x TAP buffer and 40 U RNaseOut™ for 1 hour at 37°C.
RNA was extracted with phenol:chloroform and precipitated
as above, and then re-suspended in 7 µl DEPC-treated
water.

De-phosphorylated, de-capped RNA (7 µl) was added to the
pre-aliquoted GeneRacer™ RNA Oligo (0.25 µg) and incubated
at 65°C for 5 minutes. A 10 µl ligation reaction was then
set up by the addition of 1 µl 10x ligase buffer, 1 µl
10 mM ATP, 40 U RNaseOut™ and 5 U T4 RNA ligase and incubated
at 37°C for 1 hour. RNA was extracted and precipitated as
described previously and re-suspended in 11 µl DEPC-
treated water.

First-strand cDNA was prepared by the addition of 1 µl
GeneRacer™ Oligo dT primer and 1 µl dNTP mix (10 mM each)
to 10 µl ligated RNA and incubated at 65°C for 5 minutes.
The following reagents were added to the 12 µl ligated RNA
and primer mix: 4 µl 5x first strand buffer, 2 µl 0.1 M
DTT, 1 µl RNaseOut™ and 1 µl SuperScript™ II RT (200 U/µl)
and incubated first at 42°C for 50 minutes and then, to
stop the reaction, at 70°C for 15 minutes. 2 U RNase H was
added to the reaction mix and incubated at 37°C for 20
minutes.

To amplify the 5' cDNA ends a 50 µl PCR reaction was set up
using 1 µl of the RACE-ready cDNA prepared above, 1 µl
GeneRacer™ 5' primer, 1 µl reverse gene-specific primer
(SEQ ID No. 50; Ox6race_rev: 5 pmol/µl stock), 1 µl dNTP
solution (10 mM each), 2 µl 50 mM MgSO4, 5 µl High Fidelity
PCR buffer, 0.5 µl Platinum® Taq DNA Polymerase High
Fidelity (5 U/µl) and 38.5 µl sterile water. Cycling
parameters are given in Table II below.

A second, nested PCR stage was then set up using 1 µl of
the RACE cDNA from the first stage above, 1 µl Nested 5'
primer (supplied with kit), 1 µl reverse gene-specific
primer (SEQ ID No. 50; Ox6race_rev: 5 pmol/µl stock), 1 µl
dNTP solution (10 mM each), 2 µl 50 mM MgSO4, 5 µl High
Fidelity PCR buffer, 0.5 µl Platinum® Taq DNA Polymerase
High Fidelity (5 U/µl) and 38.5 µl sterile water. Cycling
parameters are given in Table II below.

To amplify 3' ends a 50 µl PCR reaction was set up using 1
µl of the RACE-ready cDNA prepared above, 1 µl GeneRacer™
3' primer (10 µM), 1 µl forward gene-specific primer (SEQ
ID No. 51; Ox7race_for: 5 pmol/µl stock), 1 µl dNTP
solution (10 mM each), 2 µl 50 mM MgSO4, 5 µl High
Fidelity PCR buffer, 0.5 µl Platinum® Taq DNA Polymerase
High Fidelity (5 U/µl) and 38.5 µl sterile water. Cycling
parameters are given in Table II below.

A second, nested PCR stage was then set up using 1 µl of
the 3' RACE cDNA from the first stage above, 1 µl Nested
3' primer (supplied with kit), 1 µl forward gene-
specific primer (SEQ ID No. 52; Ox8race_for: 5 pmol/µl
stock), 1 µl dNTP solution (10 mM each), 2 µl 50 mM MgSO4,
5 µl High Fidelity PCR buffer, 0.5 µl Platinum® Taq DNA
Polymerase High Fidelity (5 U/µl) and 38.5 µl sterile
water. Cycling parameters are given in Table II below.



Table II. Cycling parameters for 5' and 3' RACE

5' and 3' RACE confirmed the predicted 5' ATG and 3' stop
codon as well as giving the 5' and 3' untranslated regions
shown as bases 1-114 and 1385-1921 of SEQ ID No. 2. The
coding sequence for 2031 OR thus determined was identical
to that given as bases 299-469 and 520-1618 of the gDNA
given as SEQ ID No. 1.

Example 4. Identification of other fungal 2031 ORs and
related genes

Homologs of A. fumigatus 2031 OR were identified in other
fungi and bacteria by means of bioinformatics analysis.
Sequences identified by bioinformatics can be used to
design primers which in turn can be used in PCR to
generate DNA coding for the 2031 OR homolog.

Alternatively, degenerate PCR can be used to obtain
sequence for novel genes, which can then be used to
generate probes for screening cDNA or genomic libraries of
the organism of interest to identify clones containing the
2031 OR homolog. As a further alternative, Southern blots
using fragments of genes from one species as probes can
be used to identify the presence of a homolog in the
genome of a second species. The same probe can then be
used to screen cDNA or genomic DNA libraries. Once clones
corresponding to the novel genes have been identified they
can be expressed for functional characterisation of the
protein.

4.1 Identification of homologs by bioinformatics
Analysis of the 2031 OR protein sequence with PFAM
(http://www.sanger.ac.uk/Software/Pfam/) identified this
as a member of the Oxidored_FMN family (PF00724), E-value
3.6e-57. This includes the well-characterised "Old Yellow
Enzyme" proteins of S. cerevisiae and other fungi.

Homologs of A. fumigatus 2031 OR sequence were identified
by database searches (see Table III). Where necessary,
matching contigs were downloaded and genes predicted from
genomic DNA by Genscan analysis, blast searches, alignment
and visualisation with Artemis as described in Example 2.
Protein and nucleotide multiple alignments were generated
for 2031 OR and related genes (Figures 1 and 2).

Protein and nucleic acid multiple alignments are generated
by means of programs such as ClustalX (Thompson et al.,
1994, Nucleic Acids Research, 22, 4673-4680; Thompson et
al., 1997, Nucleic Acids Research, 24, 4876-4882) and/or
using manual alignment editors such as Align
(http://www.gwdg.de/~dhepper/download/; Hepperle, D.,
2001: Multicolor Sequence Alignment Editor. Institute of
Freshwater Ecology and Inland Fisheries, 16775 Stechlin,
Germany).

Table III: 2031 homologs identified by database searches

Contig/EST/                E-value^1  SEQ ID No.                 Species (details of search
predicted gene                        EST/gDNA  cDNA^2  Protein  given in footnotes)

4929                       6.6e-81    4         5       6        Aspergillus fumigatus^3
4951                       1.1e-68    7         -       8        Aspergillus fumigatus^3
4875                       5.7e-13    -         -       -        Aspergillus fumigatus^3
4961                       3.2e-10    -         -       -        Aspergillus fumigatus^3
1.112                      3e-33      9         -       10       Aspergillus nidulans^4
6-2431                     2.6e-77    11        -       12       Candida albicans^5
6-2464                     5.9e-50    13        -       14       Candida albicans^5
6-2460                     5.8e-19    -         -       -        Candida albicans^5
A36990                     1e-15      -         -       -        Candida albicans^6
NCU07452.1                 7e-94      15        -       16       Neurospora crassa^7
NCU08900.1                 2e-19      17        18      19       Neurospora crassa^7
NCU04452.1                 2e-23      -         -       -        Neurospora crassa^7
MG04569.3                  1e-106     20        21      22       Magnaporthe grisea^8
MG03823.3                  8e-19      43        -       44       Magnaporthe grisea^8
NP_595868                  1e-05      23        -       24       Schizosaccharomyces pombe^6
OYE1                                  -         -       -        Saccharomyces cerevisiae^9
OYE2                       4.5e-19    -         -       -        Saccharomyces cerevisiae^9
OYE3                       1.0e-16    -         -       -        Saccharomyces cerevisiae^9
FsCon[0063] (EST contig)   1e-82      28        29      30       Fusarium sporotrichioides^10
GZ15771741                 5e-76      36        37      38       Fusarium graminearum^10
MgC02ai] (EST contig)      2e-67      39        -       40       Mycosphaerella graminicola^10
CtCon[0249] (EST contig)   1e-55      25        26      27       Colletotrichum trifolii^10
FsCon[0458] (EST contig)   1e-42      34        -       35       Fusarium sporotrichioides^10
FsCon[0237] (EST contig)   1e-40      31        32      33       Fusarium sporotrichioides^10
Mga0328f                   3e-35      41        -       42       Mycosphaerella graminicola^10
T44612                     1e-52      -         -       -        Pseudomonas putida^11
NP_625402                  1e-79      -         -       -        Streptomyces coelicolor^11
NP_295913                  1e-78      -         -       -        Deinococcus radiodurans^11
AF320254                   5e-55      -         -       -        Deinococcus radiodurans^11
FG00074.1                             82        82      83       Fusarium graminearum^12
Contig 1.2                 1e-71      84        84      85       Ustilago maydis^13

^1 E-values for blast scores refer to searches with 2031 OR
protein unless specified otherwise in footnotes.

^2 cDNA was generated in cases where either the gene
contains multiple exons, or there are probable frame-shift
errors from sequencing of the EST, or the EST given is the
non-coding strand.

^3 Search of the A. fumigatus genome at http://www.TIGR.org
(tblastn) with NP_595868.

^4 Search of A. nidulans genome held on local machine
(tblastn).

^5 Search of the C. albicans genome at
http://www-sequence.stanford.edu/group/candida/ (blastp).

^6 Search of the non-redundant protein sequence database
(nr) at http://blast.genome.ad.jp (blastp).

^7 Search of the N. crassa predicted proteins at
http://www.broad.mit.edu/annotation/fungi/neurospora/
(blastp).

^8 Search of the M. grisea predicted proteins at
http://www.broad.mit.edu/annotation/fungi/magnaporthe/
(blastp).

^9 Search of S. cerevisiae orf proteins
(http://mips.gsf.de/cgi-bin/blast/blast_page?genus=yeast).

^10 Search of COGEME pathogenic fungal EST database at
http://cogeme.ex.ac.uk/blast.html (tblastn, max
E-val = 0.1).

^11 Search of NCBI non-redundant protein database on local
machine with SEQ ID No. 1 (blastx). Only a selected set of
hits against bacterial proteins are shown.

^12 Search of F. graminearum predicted proteins held on
local machine (blastp).

^13 Search of U. maydis contigs held on local machine
(tblastn).
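
To illustrate how the E-values in Table III might be compared, the sketch below ranks a few of the tabulated hits from most to least significant. The identifiers and E-values are transcribed from the table; the 2031 OR versus OYE-like labels follow the groupings given in Example 2, with NCU04452.1 taken here to be the entry listed there as Nc4452 (an assumption). E-values from searches of different databases are not strictly comparable, so this is a rough illustration only.

```python
# A few rows transcribed from Table III:
# (identifier, E-value as printed, grouped with the 2031 ORs in Example 2?)
HITS = [
    ("4929", "6.6e-81", True),
    ("4951", "1.1e-68", True),
    ("4875", "5.7e-13", False),
    ("4961", "3.2e-10", False),
    ("NCU07452.1", "7e-94", True),
    ("NCU08900.1", "2e-19", True),
    ("NCU04452.1", "2e-23", False),  # assumed to correspond to Nc4452
]


def rank_by_evalue(hits):
    """Sort hits from most to least significant (smallest E-value first)."""
    return sorted(hits, key=lambda hit: float(hit[1]))


ranked = rank_by_evalue(HITS)
for name, evalue, is_2031 in ranked:
    label = "2031 OR" if is_2031 else "OYE-like"
    print(f"{name:12s} {evalue:>8s}  {label}")
```

In this subset the OYE-like hit NCU04452.1 (2e-23) ranks above NCU08900.1 (2e-19), which is listed among the 2031 ORs, so E-value alone does not cleanly separate the two groups.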

To clarify the relationships between the 2031 OR, OYE and
the hits identified from blast searches, phylogenetic
analysis was carried out. The PHYLIP suite of programs was
used (Felsenstein, J., 2002. PHYLIP
(Phylogeny Inference Package) version 3.6a3. Distributed
by the author. Department of Genome Sciences, University
of Washington, Seattle). The multiple alignment used for
the analyses was essentially that given in Figure 1 with
partial sequences, gapped regions and unreliably aligned
sections excluded. A distance matrix was generated using
PROTDIST with the Jones-Taylor-Thornton model and the tree
inferred using FITCH with global rearrangements and 10
jumbles of input order. 100 bootstrap replicates were
generated using SEQBOOT, distance matrices generated using
PROTDIST as above, trees inferred using NEIGHBOR, and
then bootstrap values and the consensus tree were
calculated using CONSENSE. Trees were viewed using
TREEVIEW (Page, R. D. M., 1996. TREEVIEW: An
application to display phylogenetic trees on personal
computers. Computer Applications in the Biosciences 12,
357-358).
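
The bootstrap step can be pictured as resampling the columns of the alignment with replacement, so that each replicate has the same length as the original and every sequence stays in register. The following is a minimal sketch of that idea, not PHYLIP's own code; the toy alignment and function name are invented for illustration and are not sequences from this document.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Resample alignment columns with replacement, keeping rows in register."""
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # Draw one column index per position; the same draw is applied to every row.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences, for illustration only).
toy = {"seqA": "MKTAYIAK", "seqB": "MKSAYLAK", "seqC": "MRTAFIAR"}
rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]  # 100 replicates, as in the text
print(replicates[0])
```

Each replicate would then be passed through the same distance and tree-building steps, and the fraction of replicate trees containing a given clade gives its bootstrap value.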

Phylogenetic analysis identified a clade supported by good
bootstrap values, which included A. fumigatus 2031 OR and
other enzymes. This could be distinguished from a clade
containing OYE enzymes which was also supported by good
bootstrap values. Bacterial homologs of both 2031 OR and
OYE (not shown) were also identified. We have therefore
identified a set of 2031 OR homologs which, surprisingly,
is distinct from the well-characterised OYE family, and
which, by virtue of the essentiality demonstrated for A.
fumigatus 2031 OR, represents a set of potential targets
for anti-fungal drugs.

4.2 Identification of homologs by degenerate PCR
4.2.1 Preparation of genomic DNA from organism of
interest

Fungal cultures are prepared using methods suitable for
particular species. For example, Aspergillus and Candida
species, Cryptococcus neoformans, Fusarium solani and
Trichophyton species are maintained on Sabouraud dextrose
agar at 30-35°C; Leptosphaeria nodorum on Malt agar medium
(30 g/L malt extract; 15 g/L Bacto-agar, pH 5.5), 24.0°C;
Magnaporthe grisea on Oatmeal agar (6.1 g/L agar, 53.3 g/L
instant oatmeal) 25.0°C, or Cornmeal agar (Difco 0386),
26.0°C; Phytophthora capsici cultures were maintained
on V-8 agar at 24°C; Pyricularia oryzae cultures were
maintained on rice polish agar at 24°C under white
fluorescent lights (12 h artificial day), and were
subcultured every 7-14 days by the transfer of mycelial
plugs to fresh plates; Pythium ultimum cultures were
maintained on PDA at 24°C, and subcultured every 7 days by
the transfer of aerial mycelium to fresh plates with an
inoculating needle; Rhizoctonia solani cultures were
maintained on PDA at 24°C under fluorescent lights (12 h
artificial day), and subcultured every 7 days by the
transfer of mycelial plugs to fresh plates; Ustilago
maydis cultures were maintained on PDY agar at 30°C in the
dark, and subcultured by re-streaking.

Genomic DNA was prepared from cultures using standard
methodologies, e.g. using the Qiagen DNeasy Plant Kit, or
using methods described in Example 2.2.

4.2.2 PCR

Primers (SEQ ID Nos. 53 and 54) were designed on the
specific regions given as regions 2 and 6 in Figure 2.
However, those skilled in the art will appreciate that it
may be necessary to try alternative primers. PCR reactions
using the above primer pair are set up as follows:

12.5 µl 2x ReddyMix PCR mastermix (ABgene)
1 µl primer SEQ ID No. 53 (5 pmol)
1 µl primer SEQ ID No. 54 (5 pmol)
template gDNA (1.5-4 µg/ml)
nuclease-free water to give a final volume of 25 µl

The reactions are run using the following conditions on a
Biometra personal PCR cycler (Thistle Scientific Ltd, DFDS
House, Goldie Road, Uddington, Glasgow, G71 6NZ):

Step 1   95°C   5 min
Step 2   95°C   1 min
Step 3   53°C   1 min 30 sec
Step 4   68°C   2 min 30 sec
Step 5   72°C   10 min
Step 6   4°C    Hold

30 cycles of steps 2-4 were carried out. The PCR products
are purified (to remove residual enzymes and nucleotides)
using Qiagen's QIAquick PCR Purification Kit (Qiagen Ltd,
Boundary Court, Gatwick Road, Crawley, West Sussex, RH10
9AX, UK) according to the manufacturer's instructions and
eluted into 40 µl of sterile water (BDH molecular biology
grade/filter sterile). The purified PCR products are
examined on 1% agarose gels.
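
By way of illustration only, the programmed hold time of the
cycling programme given in 4.2.2 can be tallied as in the
following sketch. The function and variable names are
hypothetical, and ramping time between temperatures, which
depends on the instrument, is deliberately ignored.

```python
# Hold times (minutes) transcribed from the cycling programme in 4.2.2.
INITIAL_DENATURATION_MIN = 5.0
CYCLE_STEPS_MIN = {
    "denature 95C": 1.0,
    "anneal 53C": 1.5,
    "extend 68C": 2.5,
}
FINAL_EXTENSION_MIN = 10.0
N_CYCLES = 30


def programmed_minutes(initial, cycle_steps, n_cycles, final):
    """Sum the programmed hold times; ramping between steps is not counted."""
    return initial + n_cycles * sum(cycle_steps.values()) + final


total_min = programmed_minutes(
    INITIAL_DENATURATION_MIN, CYCLE_STEPS_MIN, N_CYCLES, FINAL_EXTENSION_MIN
)
print(f"Programmed hold time: {total_min:.0f} min")
```

The same function can be reused when elongation or annealing
times are varied during optimisation of a degenerate PCR.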



Those skilled in the art will appreciate that degenerate
PCR may require variations in a number of parameters in
the attempts to generate a product. These include primer
concentration, template concentration, concentration of
Mg2+ ions, elongation and annealing times, and annealing
temperature. Variations in temperature can be accommodated
by the use of a gradient PCR machine.

The purified PCR products are cloned into pGEM-T Easy
(Promega) and then transformed into XL10-Gold® Kan
ultracompetent E. coli cells according to the
manufacturer's instructions. The transformation reactions
are then plated onto LB agar plates containing ampicillin
(100 µg/ml), 50 µl X-gal (4%) and 10 µl IPTG (100 mM).
Following overnight incubation at 37°C, individual white
colonies from each transformation are sub-cultured into LB
broth containing ampicillin (100 µg/ml). After overnight
incubation at 37°C with shaking, plasmids are extracted
using Qiagen spin mini plasmid extraction kits according
to the manufacturer's instructions and sent away for
full-length sequencing.

4.3 Identification of homologs by Southern Blotting

4.3.1 Digestion of genomic DNA and transfer to nylon
membranes

Genomic DNA from the fungi of interest is digested with
the appropriate restriction enzyme and run on a 0.8%
agarose gel. The gel is then submerged in 250 mM HCl for
no more than 10 min, with shaking, at room temperature,
after which the gel is rinsed with sterilised RO water.

Transfer of the DNA onto nylon membrane is carried out
using 0.4 M NaOH. Transfer protocols and apparatus are
well known and are described in e.g. Sambrook et al.
(1989), Molecular Cloning, 2nd Edition, Cold Spring
Harbor Laboratory Press. After transfer, the DNA is fixed
to the membrane by baking at 120°C for 30 min. The
membrane can then be used immediately, or stored dry for
future use.

4.3.2. Preparation of probe

Probes are generated either by restriction digests of DNA
or by PCR of an appropriate region. A suitable probe can
be generated by PCR using the primer pair SEQ ID Nos. 53
and 54, A. fumigatus genomic DNA, and the methods given in
4.2.2.

1 µg DNA template is diluted in molecular biology water to
a total volume of 16 µl, denatured in a boiling water bath
for 10 min, and quickly chilled on ice. 4 µl DIG-High
Prime (1 mM dATP, 1 mM dCTP, 1 mM dGTP, 0.65 mM dTTP, 0.35
mM alkali-labile digoxigenin-11-dUTP, 1 U/µl labelling-
grade Klenow enzyme, 5 x reaction buffer, in 50% (v/v)
glycerol) is then added and the reaction incubated at 37°C
for 20 hours, after which 2 µl of 200 mM EDTA pH 8.0 is
added to terminate the labelling reaction. The labelling
efficiency is estimated by comparison with DIG-labelled
control DNA.

4.3.3. Prehybridisation and Hybridisation

The membrane is placed in a hybridisation tube containing
20 ml of prehybridisation solution (DIG Easy Hyb, Roche)
per 100 cm² of membrane surface area and prehybridised at
42°C for 2 hours in a hybridisation oven. The DIG-
labelled probe is denatured by heating in a boiling water
bath for 10 min and then chilled directly on ice. The
probe is then diluted to approximately 200 ng/mL in
hybridisation solution (Easy Hyb, Roche; at least 5 mL of
hybridisation solution is required per hybridisation). The
prehybridisation solution is discarded from the
hybridisation tube and the hybridisation solution
containing the DIG-labelled probe added quickly. The
hybridisation then proceeds overnight at 42°C in the
hybridisation oven. The optimum temperature is dependent
on probe size and homology with the target sequence and is
determined empirically.

After hybridisation, the membrane is washed twice at 42°C,
5 min per wash, with 50 mL of stringency wash solution (3
x SSC/0.1% SDS; where 20 x SSC buffer is 3 M NaCl, 300 mM
sodium citrate, pH 7.0), followed by two washes at RT, 15
min per wash, in 50 mL stringency wash solution. The
stringency of these washes can be decreased by increasing
the SSC concentration to 6 x SSC, 0.1% SDS and/or
decreasing the wash temperatures.
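
As a worked illustration of the wash solutions described
above, the following sketch derives the salt concentrations
and the volume of 20 x SSC stock needed for a 50 mL wash at a
given SSC strength. It relies only on the stock composition
stated in the text and the usual C1V1 = C2V2 dilution
relationship; the function names are hypothetical.

```python
# 20x SSC stock as defined in the text: 3 M NaCl, 300 mM sodium citrate.
STOCK_FOLD = 20
STOCK_MM = {"NaCl": 3000.0, "sodium citrate": 300.0}


def ssc_composition_mm(fold):
    """Millimolar salt concentrations in an SSC solution of the given strength."""
    return {salt: conc * fold / STOCK_FOLD for salt, conc in STOCK_MM.items()}


def stock_volume_ml(fold, final_volume_ml):
    """Volume of 20x SSC stock required for the final volume (C1*V1 = C2*V2)."""
    return final_volume_ml * fold / STOCK_FOLD


for fold in (3, 6):
    comp = ssc_composition_mm(fold)
    vol = stock_volume_ml(fold, 50.0)
    print(f"{fold}x SSC: {comp['NaCl']:.0f} mM NaCl, "
          f"{comp['sodium citrate']:.0f} mM citrate, {vol:.1f} mL stock per 50 mL")
```

The higher salt concentration at 6 x SSC corresponds to the
lower-stringency wash mentioned in the text.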

4.3.4. Detection

The membrane is washed in 20 mL washing buffer (100 mM
maleic acid, 150 mM NaCl; pH 7.5; 0.3% v/v Tween 20), and
then incubated successively with the following: 20 mL
blocking solution (1% w/v blocking reagent for nucleic
acid hybridisation, Roche, dissolved in 100 mM maleic acid,
150 mM NaCl, pH 7), for 30 min at room temperature; Anti-
DIG-alkaline phosphatase (Roche) diluted 1:5,000 in
blocking buffer, 30 min at room temperature; washing
buffer, two washes each of 15 min at room temperature;
detection buffer (100 mM Tris-HCl, 100 mM NaCl; pH 9.5), 2
min at room temperature. The membrane is then removed,
placed on top of an acetate sheet, and 0.5 ml (per
100 cm²) of CSPD or CDP-Star added to the top of the
membrane. A second sheet of acetate is then placed over
the surface of the membrane, the assembly incubated for 5
min at room temperature and then sealed in a plastic bag.
The assembly is then exposed to X-ray film for between 15
min and 1 hour. Optimal exposure time is determined
empirically by increasing exposure time up to 24 hours.
The presence of a band on the gel is evidence of a gene in
the genomic DNA of interest. The molecular weight of the
band depends on the size of the restriction fragment that
contains the gene.

Example 5. Expression during infection of wax moth larvae
(Galleria mellonella) and mice infected with A. fumigatus

5.1 Preparation of cDNA from infected wax-moth larvae

Wax moth larvae have been shown to be good model systems
in which to study Candida infection (Cotter et al., 2000,
FEMS Immunol Med Microbiol 21, 163-9; Brennan et al.,
2002, FEMS Immunol Med Microbiol 34, 153-7). We have found
that this insect system is also a good system in which to
study Aspergillus infection (D. Law and J. Rooke,
manuscript in preparation).

5.1.1 Growth and infection of wax-moth larvae

Spores of A. fumigatus (AF293), grown on Sabouraud Dextrose
agar, were harvested and re-suspended in PBS/Tween 80.
Spores were washed and the concentration adjusted such
that a 10 µl inoculum will cause death in 90% of the test
group 3-4 days after infection (for AF293 this is 5.0-
7.0x10^ cfu/ml). Inoculum concentration was estimated
using an improved Neubauer haemocytometer counting chamber
and confirmed by TVC enumeration.
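
Purely as an illustrative sketch, a spore concentration can
be estimated from haemocytometer counts as shown below. The
calculation assumes that counts are made in the 1 mm x 1 mm
squares of an improved Neubauer chamber of 0.1 mm depth (each
holding 1e-4 ml). The counts, dilution factor and target
concentration are hypothetical example values and are not
taken from the experiments described here.

```python
def spores_per_ml(counts_per_large_square, dilution_factor=1.0):
    """Spores/ml from counts in 1 mm x 1 mm x 0.1 mm squares (1e-4 ml each)."""
    mean_count = sum(counts_per_large_square) / len(counts_per_large_square)
    return mean_count * dilution_factor * 1e4


def dilution_volumes_ml(stock_per_ml, target_per_ml, final_volume_ml):
    """Return (stock volume, diluent volume) giving the target concentration."""
    if target_per_ml > stock_per_ml:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ml = final_volume_ml * target_per_ml / stock_per_ml
    return stock_ml, final_volume_ml - stock_ml


# Hypothetical counts from four large squares of a 1:100 diluted suspension.
example_counts = [52, 47, 55, 46]
stock_conc = spores_per_ml(example_counts, dilution_factor=100)

# Hypothetical target concentration for 1 ml of inoculum.
stock_ml, diluent_ml = dilution_volumes_ml(stock_conc, 1e7, 1.0)
print(f"Stock: {stock_conc:.2e} spores/ml; mix {stock_ml:.2f} ml stock "
      f"with {diluent_ml:.2f} ml PBS/Tween")
```

A viable count would still be needed to confirm the estimate,
as described above.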

Wax moth larvae were purchased from Livefood UK, Somerset,
UK (www.livefood.co.uk), and were maintained in the dark
at room temperature in wood shavings prior to infection.
Healthy larvae (250 mg +/- 50 mg) were selected and
incubated at 4°C for 10 minutes immediately prior to
infection to immobilise them. Larvae were then injected
through the cuticle of the left last pro-leg with 10 µl
spore suspension (100x stock), using a sterile Hamilton
syringe. Larvae were then transferred to a sterile Petri
dish. The following controls were also established: larvae
injected with 10 µl PBS/Tween only; larvae injected with
10 µl heat-killed spores (killed by incubation for 20 min
at 100°C); larvae pierced but not injected; and untouched
larvae. Larvae were incubated at 30°C and monitored at
least twice daily. All treatments and controls were
carried out on batches of 10 larvae. Larval deaths and
general health condition were recorded every 24 h and
dead or moribund larvae were removed from the test group.

5.1.2 Preparation of DNA-free RNA from Aspergillus
fumigatus-infected wax moth larvae (Galleria mellonella)

cDNA was prepared from the following sources: uninfected
larvae; larvae after 48 h infection with A. fumigatus
(early infection); larvae after 72 h infection with A.
fumigatus (late infection); larvae infected with heat-
killed A. fumigatus spores; and A. fumigatus grown in
Sabouraud Dextrose agar broth for 16 h.

Frozen larvae were ground to a fine powder under liquid
nitrogen in a mortar and pestle previously baked at 22°C
overnight, treated with RNaseZAP, rinsed with DEPC-treated
water (0.1% (v/v) DEPC, stirred for 1 h and autoclaved for
1 h) and cooled with liquid nitrogen. Ground sample was
transferred to Eppendorf tubes (no more than 50 mg per
tube) and total RNA extracted using the Qiagen RNeasy
Plant Mini Kit following the protocol for isolation of
total RNA from filamentous fungi in the RNeasy Mini
Handbook (06/2001, pages 75-78,
http://www.qiagen.com/literature/handbooks/rna/rnamini/1016272HBRNY_062001WW.pdf).

The following modifications were used: at step 3, 600 µl
RLT was added to each 50 mg tissue and vortexed; at step
4, samples were centrifuged for 3 min at maximum speed; at
step 6, all samples from the same tissues were applied to
the same RNeasy column; at step 7, the RNeasy column was
incubated for 5 min at room temperature after addition of
RW1; optional step 9a was carried out twice; at step 10,
30 µl RNase-free water was added, samples incubated for 10
min at room temperature, and then centrifuged for 1 min at
14,000 RPM; at step 11, the elution step was repeated to
give a total volume of 60 µl RNA. A sample of the RNA was
run on a 1.5% agarose gel and the amount of RNA quantified
using the molecular marker. RNA was then stored at -80°C.

A portion of the RNA was DNase treated using 2 µl RNase-
free DNase (Promega) per µg RNA, in the presence of 10X
DNase buffer (Promega) at 37°C for 4 h. The RNA was then
cleaned up using the Qiagen RNeasy Plant Mini Kit
following the RNeasy Mini Protocol for RNA Cleanup (RNeasy
Mini Handbook 06/2001, pages 79-81), but including a
further DNase treatment step during clean-up as in the
RNeasy handbook.

The following modifications were made: optional step 5a
was carried out; at step 6, 30 µl RNase-free water was
added, samples incubated for 10 min at room temperature
and then centrifuged for 1 min at 14,000 RPM; at step 7,
the eluate from step 6 was transferred onto the RNeasy
column, incubated for 10 min at room temperature, and then
centrifuged for 1 min at 14,000 RPM. A sample of the
DNase-treated RNA was run on an agarose gel, quantified
and stored at -80°C.

5.1.3 Checking RNA samples for DNA contamination

To verify the absence of genomic DNA from the RNA samples,
PCR was carried out using primers that amplify the β-
tubulin gene (SEQ ID Nos. 77 and 78). In the absence of a
reverse-transcription step, only gDNA will be detected and
thus any gDNA contamination will be revealed. The
following reaction mixture was set up:

12.5 µl 2x ReddyMix PCR mastermix (ABgene)
1 µl each primer (5 pmol)
template gDNA (1.5-4 µg/ml)
nuclease-free water to give a final volume of 25 µl

The reactions were run using the following conditions on a
Biometra personal PCR cycler (Thistle Scientific Ltd, DFDS
House, Goldie Road, Uddington, Glasgow, G71 6NZ):

Step 1   95°C   5 min
Step 2   90°C   1 min
Step 3   51°C   1 min
Step 4   68°C   1 min
Step 5   68°C   10 min
Step 6   4°C    Hold

40 cycles of steps 2-4 were carried out.

If a PCR product was observed, genomic DNA was present
and the sample was DNase-treated again. If the PCR was
negative, no DNA was present in the sample.

5.1.4 Preparation of cDNA

300 µg DNA-free RNA and 3 µl oligo (dT) (100 ng/µl) were
added to an RNase-free 0.5 ml microcentrifuge tube, and
made up to a total volume of 42 µl with DEPC-treated water.
Samples were mixed and incubated in a heat block at 65°C
for 5 min and then slowly cooled to room temperature. 2 µl
Ultrapure dNTPs (10 mM each, Clontech), 1 µl StrataScript
reverse transcriptase (Stratagene) and 5 µl 10X reverse
transcriptase reaction buffer were then added. The samples
were incubated at 42°C for 1 h, denatured at 90°C for 5 min
and then cooled on ice. Samples were dispensed in 5-10 µl
aliquots and stored at -20°C.

5.2. Preparation of cDNA from infected mice

5.2.1 Infection of mice with A. fumigatus and extraction
of tissues

Mice were infected with Aspergillus fumigatus and organs
harvested as follows. Thirteen male CD1 mice were injected
with the immunosuppressant cyclophosphamide (0.025 g/ml;
200 mg/kg) IV via the tail vein. After 72 hours, twelve
mice were injected with 0.15 ml Aspergillus fumigatus
AF293 conidia (7.5 x 10^/ml). 11 hours after infection,
four mice were sacrificed with an overdose of inhaled
halothane. The brain, lungs, liver and kidney were
removed, frozen by immersion in liquid nitrogen, and
stored at -70°C. A further four mice were also sacrificed
at each of 24 and 48 hours after infection.

RNA was prepared from mouse tissues as described for wax
moth larvae above (5.1.2 and 5.1.3).

5.2.2 Preparation of cDNA from DNA-free RNA

cDNA was prepared from DNA-free RNA using the Promega
Reverse Transcription kit, following the protocol as
supplied with the product (Technical Bulletin No. 099,
http://www.promega.com/tbs/tb099/tb099.pdf). In a
modification to the protocol, the cDNA synthesis reaction
was incubated for 60 min at 42°C rather than for the
suggested 15 min. Samples were stored in 5-10 µl aliquots
at -20°C.

5.3 Design and optimisation of primers

Primers were designed against the 2031 OR cDNA sequence
using Beacon Designer 2.1 (Premier Biosoft,
http://www.premierbiosoft.com) with the following
parameters: Target Tm = 58 ± 5°C; Length of primers = 15-
24; Amplicon length = 75-150 bp. All other settings were
default. Care was taken to choose primers that would not
form dimers or other secondary structures. Secondary
structures of amplicons were calculated using mfold
(http://www.bioinfo.rpi.edu/applications/mfold/old/dna/form1.cgi)
and primer sets giving an amplicon with little or
no secondary structure were chosen. The resulting primers
are given as SEQ ID Nos. 79 and 80.

To determine the optimum annealing temperature for the
primer set, a gradient PCR was run on an iCycler PCR
machine (Biorad), using A. fumigatus AF293 genomic DNA as
a template and the following reaction mixture:

112.5 µl Abgene PCR Reddymix
9 µl SEQ ID No. 79; OXRED 2031F6 (5 pmol/µl)
9 µl SEQ ID No. 80; OXRED 2031R5 (5 pmol/µl)
85.5 µl H2O
9 µl AF293 gDNA (10 ng/µl)

For the negative control, the gDNA was omitted and the
amount of water increased correspondingly.
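
The bulk volumes above are consistent with a 25 µl reaction
scaled to nine reactions (eight gradient wells plus one
spare). The following sketch illustrates that scaling. The
per-reaction breakdown and the interpretation of the ninth
reaction as a pipetting allowance are inferred from the
volumes and are assumptions; the names are hypothetical.

```python
# Per-reaction volumes (µl) for a 25 µl reaction. ASSUMPTION: inferred by
# dividing the bulk recipe in the text by nine.
PER_REACTION_UL = {
    "2x Reddymix": 12.5,
    "primer SEQ ID No. 79": 1.0,
    "primer SEQ ID No. 80": 1.0,
    "water": 9.5,
    "template gDNA": 1.0,
}


def master_mix(per_reaction, n_reactions):
    """Scale every per-reaction volume by the number of reactions."""
    return {name: vol * n_reactions for name, vol in per_reaction.items()}


def negative_control(mix):
    """Replace the template with an equal volume of water."""
    control = dict(mix)
    control["water"] += control.pop("template gDNA")
    return control


bulk = master_mix(PER_REACTION_UL, 9)
no_template = negative_control(bulk)
print(bulk, sum(bulk.values()))
print(no_template, sum(no_template.values()))
```

Both mixes total 225 µl, so the same 25 µl aliquots can be
dispensed from either.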

For each mix, 25 µl was pipetted into 8 wells on a
multiwell plate, and each well run at a different
temperature (between 50 and 65°C) with the following
conditions:

Step 1.  95°C - 5 min
Step 2.  95°C - 1 min
Step 3.  Gradient 50-65°C - 1.5 min
Step 4.  72°C - 1 min
Step 5.  72°C - 10 min
Step 6.  8°C - hold

Steps 2-4 were run for 30 cycles.

The PCR products were run on a 2% agarose gel. A single
band of the correct size of 148 bp was seen on the gel for
all the temperatures, and the optimum was found to be
63°C.

5.4 Testing species-specificity of primers

The real-time primers designed above were further tested
to ensure that mouse nucleic acid was not amplified using
these primers. Four reactions were set up, each containing
the following:

12.5 µl Abgene Reddymix
1 µl primer SEQ ID No. 79
1 µl primer SEQ ID No. 80
9.5 µl H2O

and either: 1 µl infected mouse kidney cDNA (50 ng/µl;
experimental); 1 µl uninfected mouse kidney cDNA (50
ng/µl; uninfected control); 1 µl AF293 gDNA (10 ng/µl;
positive control); or 1 µl water (negative control).

The following PCR settings were used:

Step 1   95°C - 5 min
Step 2   95°C - 1 min
Step 3   63°C - 1.5 min
Step 4   72°C - 1 min
Step 5   72°C - 10 min
Step 6   8°C - hold

Steps 2-4 were run 40 times.

The PCR products were run on a 2% agarose gel. A.
fumigatus genomic DNA gave a band of 148 bp, the expected
size, but no bands were seen in uninfected or infected
mouse cDNA. These primers therefore appeared to be
specific.

5.5 Real-time PCR to detect expression in infected larvae

PCR reactions were set up using the Biorad iQ SYBR green
supermix as follows:

14 µl Primer SEQ ID No. 79
14 µl Primer SEQ ID No. 80
175 µl SYBR mix
133 µl H2O

Four reactions were set up containing 72 µl of the above
mix and either 3 µl H2O; 3 µl uninfected larvae cDNA (50
ng/µl); 3 µl AF293 gDNA (5 ng/µl); or 3 µl infected larvae
cDNA (50 ng/µl). 3 x 25 µl aliquots of each
reaction were dispensed into an Abgene multiwell plate,
the plate sealed with optical sealing tape (Biorad), then
placed in a Biorad iCycler real-time PCR machine.
Reactions were run with the following conditions:

Step 1.  95.0°C  3 min
Step 2.  95.0°C  30 sec
Step 3.  63.0°C  30 sec
         Data collection and real-time analysis enabled.
Step 4.  72.0°C  15 sec
60 cycles of steps 2-4.
Step 5.  95.0°C  30 sec
Step 6.  50.0°C  30 sec
Step 7.  50.0°C  10 sec
90 cycles of step 7 with setpoint temperature increased by
0.5°C after each cycle starting with cycle 2.
Melt curve data collection and analysis enabled.
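
For illustration, the temperature ramp implied by step 7 can
be enumerated as in the following sketch, which simply applies
the stated start temperature, increment and number of repeats.
The function name is hypothetical and instrument behaviour
between setpoints is not modelled.

```python
def melt_curve_setpoints(start_c=50.0, increment_c=0.5, n_cycles=90):
    """Setpoint of each repeat; the increment is applied from cycle 2 onwards."""
    return [start_c + increment_c * i for i in range(n_cycles)]


setpoints = melt_curve_setpoints()
hold_seconds = 10 * len(setpoints)  # each repeat of step 7 is held for 10 sec

print(f"First setpoint: {setpoints[0]:.1f} C")
print(f"Last setpoint:  {setpoints[-1]:.1f} C")
print(f"Total hold time: {hold_seconds / 60:.0f} min")
```

The ramp therefore runs from 50.0°C to 94.5°C, which spans
the melt temperatures reported in the melt curve tables.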

Results are shown in Tables IV and V. Expression of 2031
OR was demonstrated in both AF293 cDNA (Ct = 25.8) and in
infected larvae (Ct = 32.3). Therefore, the message is
expressed both in A. fumigatus cultures and in A.
fumigatus from infected larvae. The negative and
uninfected larvae controls gave only primer dimers and
non-specific products.

Table IV. PCR Quantification Spreadsheet Data for SYBR-490

Well   Identifier                   Ct
C08    Infected larvae (50ng)
C09    Infected larvae (50ng)       22. A
C10    Infected larvae (50ng)       31.4
D03    Negative                     51.3
D04    Negative                     N/A
D05    Negative                     55.6
H03    Uninfected larvae            36.4
H04    Uninfected larvae            N/A
H05    Uninfected larvae            N/A
H08    A. fumigatus gDNA (5ng)      25.8
H09    A. fumigatus gDNA (5ng)      26
H10    A. fumigatus gDNA (5ng)      25.8

Data Analysis Parameters: Calculated threshold was
replaced by the user selected threshold 7.4. User
selected baseline cycles were 2 to 10.



Table V. Melt Curve Analysis Spreadsheet Data for SYBR-490

Well   Well Identifier              Peak ID   Melt Temp
C8     Infected larvae (50ng)       C8.1      88.5
C9     Infected larvae (50ng)       C9.1      88.5
C10    Infected larvae (50ng)       C10.1     88.5
D3     Negative                     D3.1      vy
D5     Negative                     D5.1      81.5
                                    D5.2      77.5
H3     Uninfected larvae            H3.1      81.0
H5     Uninfected larvae            H5.1      78.0
H8     A. fumigatus gDNA (5ng)      H8.1      89.0
H9     A. fumigatus gDNA (5ng)      H9.1      89.0
H10    A. fumigatus gDNA (5ng)      H10.1     89.0

Melt Curve Analysis Parameters: Threshold for automatic
peak detection was set at 2.64.

5.6 Real-time PCR to detect expression in infected mouse
kidney cDNA

Real-time experiments similar to those described in 5.5
using 1 µl of infected mouse cDNA showed no amplification
(data not shown). The experiment was therefore carried out
using an increased amount of infected mouse cDNA with the
following conditions:

18 µl Primer SEQ ID No. 79
18 µl Primer SEQ ID No. 80
225 µl SYBR mix
99 µl H2O

Four reactions were set up containing 60 µl of the above
mix and either 15 µl H2O; 3 µl uninfected mouse kidney (50
ng/µl) + 12 µl H2O; 15 µl infected mouse kidney, 48 h
post-infection (50 ng/µl); or 3 µl AF293 cDNA (5 ng/µl) + 12
µl H2O. 3 x 25 µl aliquots of each reaction
were dispensed into an Abgene multiwell plate, the plate
sealed with optical sealing tape (Biorad), then placed in
a Biorad iCycler real-time PCR machine. Reactions were run
with the following conditions:

Step 1.  95.0°C for 3 min
Step 2.  95.0°C for 30 sec
Step 3.  63.0°C for 30 sec
         Data collection and real-time analysis enabled.
Step 4.  72.0°C for 15 sec
60 cycles of steps 2-4.
Step 5.  95.0°C for 30 sec
Step 6.  50.0°C for 30 sec
Step 7.  50.0°C for 10 sec
90 cycles of step 7 with setpoint temperature increased by
0.5°C after each cycle starting with cycle 2.
Melt curve data collection and analysis enabled.

Expression of 2031 OR was seen in A. fumigatus AF293 cDNA
(Ct = 28.8) but only in 2 of the 3 infected mouse kidney
reactions (Ct values = 34.4, 41.2) (Tables VI and VII).
The product in the other infected kidney cDNA reaction
(well A12) was a primer dimer or a non-specific product
(Tm = 81°C on the melt curve), whereas the correct 2031 OR
product has a Tm of 88.5°C (Tables VI and VII). The
negative and uninfected kidney controls gave only primer
dimers and non-specific products.



Table VI. PCR Quantification Data for SYBR-490

Well   Identifier                   Ct
A10    Infected kidney (250ng)      34.4
A11    Infected kidney (250ng)      41.2
A12    Infected kidney (250ng)      38
D02    Negative                     bU . 'J
D03    Negative                     54.6
D04    Negative                     46.2
H02    Uninfected kidney            52.5
H03    Uninfected kidney            54
H04    Uninfected kidney            51.8
H10    AF293 (5ng)                  28.7
H11    AF293 (5ng)                  28.7
H12    AF293 (5ng)                  30

Calculated threshold was replaced by the user selected
threshold 5.4. User selected baseline cycles were 2 to 10.

Table VII. Melt Curve Analysis Spreadsheet Data for SYBR-
490

Well   Well Identifier              Peak ID   Melt Temp
A10    Infected kidney (250 ng)     A10.1     88.5
A11    Infected kidney (250 ng)     A11.1     88.5
A12    Infected kidney (250 ng)     A12.1     81.0
D2     Negative                     D2.1      79.0
D3     Negative                     D3.1      '/ a . 0
D4     Negative                     D4.1      78.0
H2     Uninfected kidney            H2.1      78.5
H3     Uninfected kidney            H3.1      77.5
H4     Uninfected kidney            H4.1      yu . b
H10    AF293 (5ng)                  H10.1     88.5
H11    AF293 (5ng)                  H11.1     88.5
H12    AF293 (5ng)                  H12.1     88.5

Threshold for automatic peak detection was set at 2.09.



A. fumigatus 2031 OR is therefore clearly expressed during
infection of wax moth larvae. 2031 OR is only expressed at
a very low level during infection of mouse kidney, since
increased amounts of template had to be used to give a
signal. The expression during infection suggests that the
gene product may be a suitable target for an anti-fungal
drug.

Example 6. Expression of recombinant 2031 OR and/or
fragments

Recombinant proteins or fragments were expressed to enable
detailed study of function and as the starting point for
the development of a high-throughput screen for inhibitory
compounds.

6.1 Production of cDNA constructs

PCR was carried out using cDNA prepared as described
above to generate polynucleotides encoding 2031 OR
sequence essentially corresponding to SEQ ID No. 3.



PCR reactions were carried out using the following reaction
mixture and conditions. All reagents were present in the KOD
kit (Novagen).

2.5 µl 10x PCR Buffer
5 µl dNTPs (2 mM)
2 µl MgSO4 (25 mM)
1 µl primer A (5 pmol) (SEQ ID No. 55; SL_OxXa30F5)
1 µl primer B (5 pmol) (SEQ ID No. 56; SL-OxXa30R7)
1 µl template cDNA
11.5 µl nuclease-free water
1 µl KOD Polymerase

PCR reactions were run using the following conditions:

Step 1   94°C     5 min
Step 2   94°C     1 min
Step 3   59.3°C   1 min
Step 4   68°C     1 min 30 sec
Step 5   68°C     10 min
Step 6   10°C     Hold



40 cycles of steps 2-4 were carried out and the PCR products
were purified using Qiagen's QIAquick PCR Purification Kit
(Qiagen Ltd, Boundary Court, Gatwick Road, Crawley, West
Sussex, RH10 9AX, UK) according to the manufacturer's
instructions. The purified PCR products were examined on
agarose gels.

cDNA fragments were then cloned into the pET30 Xa/LIC vector
(Novagen), transformed into NovaBlue chemically competent E.
coli cells, and plated onto a prewarmed kanamycin (+)
selection plate. After an overnight incubation at 37°C,
kanamycin-resistant colonies were selected and grown up in
kanamycin-containing LB medium. Plasmid DNA was isolated
using the Plasmid Mini Kit (Qiagen). The presence and
correct orientation of the inserts were confirmed by
restriction analysis and sequencing of the construct.

Purified plasmid DNA, which had been confirmed to be of the
correct sequence and orientation, was transformed into
chemically competent BL21 Star (DE3) One Shot E. coli cells
and grown overnight at °C. 2 ml of an overnight culture
were used to inoculate 100 ml of LB, 30 µg/ml kanamycin, and
the cultures incubated at 37°C, 220 rpm until the cell
density reached an optical density of 0.6 (approximately 3
hours). Expression of the recombinant protein was then
induced with IPTG (1 mM) for 5 hours.

Bacteria were harvested by centrifugation at 4500 rpm for
10 minutes and the pellets lysed in lysis buffer (10 ml
BugBuster (Novagen), 10 µl Benzonase (Novagen), 0.4 µl
lysozyme (Novagen) and 100 µl 1 M imidazole) for 20 minutes
at room temperature. Cells were then spun down at 15000 g
for 20 min at 4°C and the supernatant, containing soluble
recombinant protein, removed to a clean tube.

Supernatant was added to prewashed Ni-NTA resin at a
concentration of 5-10 mg protein per ml of resin and
allowed to bind for 1 hour at 4°C. The protein-resin mix was
then poured into a column, washed twice in 4 ml of wash
buffer (2.5 ml 1 M phosphate buffer pH 8, 6.25 ml 4 M NaCl,
1 ml 1 M imidazole pH 8, 0.5 ml 10% Tween 20; made up to 50
ml in nH2O) and then eluted in 4 x 0.5 ml fractions with
elution buffer (250 µl 1 M phosphate buffer pH 8, 625 µl 4 M
NaCl, 1.25 ml 1 M imidazole pH 8, 50 µl 10% Tween 20; made
up to 5 ml in nH2O). Fractions containing purified
protein were detected by SDS-PAGE and Western blotting
using an S-tag HRP conjugate (Novagen). Fractions
containing purified recombinant protein were concentrated
using YM10 columns (Millipore).
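
As an illustrative check of the two buffer recipes above, the
final concentration of each component can be worked out from
the stock concentration and the volumes given, as in the
following sketch. Only the figures stated in the text are
used; the dictionary layout and function names are
hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ml, final_volume_ml):
    """Final concentration after dilution (C1*V1 = C2*V2)."""
    return stock_conc * stock_volume_ml / final_volume_ml


# component: (stock concentration, unit, stock volume in ml)
WASH_BUFFER = {
    "phosphate pH 8": (1000.0, "mM", 2.5),
    "NaCl": (4000.0, "mM", 6.25),
    "imidazole pH 8": (1000.0, "mM", 1.0),
    "Tween 20": (10.0, "%", 0.5),
}
ELUTION_BUFFER = {
    "phosphate pH 8": (1000.0, "mM", 0.25),
    "NaCl": (4000.0, "mM", 0.625),
    "imidazole pH 8": (1000.0, "mM", 1.25),
    "Tween 20": (10.0, "%", 0.05),
}


def composition(recipe, final_volume_ml):
    """Final concentration of every component, keyed by component name."""
    return {
        name: (final_concentration(conc, vol, final_volume_ml), unit)
        for name, (conc, unit, vol) in recipe.items()
    }


wash = composition(WASH_BUFFER, 50.0)
elution = composition(ELUTION_BUFFER, 5.0)
for label, buffer in (("wash", wash), ("elution", elution)):
    for name, (value, unit) in buffer.items():
        print(f"{label:8s} {name:16s} {value:g} {unit}")
```

On these figures the two buffers differ only in imidazole
concentration (20 mM in the wash buffer against 250 mM in the
elution buffer).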

Figure 3A shows the induction of recombinant 2031 OR
expression by IPTG over 24 hours. Protein samples were
taken at time points, run on an SDS-PAGE gel and stained
with Coomassie. By 1 h a band of the correct size was
clearly induced compared to the uninduced samples. The
amount of protein increased with longer induction times.
Figure 3B shows a Coomassie-stained gel of the purified
recombinant 2031 OR. Alternative expression systems can be
used for expression in bacteria, such as the glutathione
S-transferase or mannose-binding fusion-protein system.

Recombinant fragments of other 2031 ORs can be generated
using the primer pairs and templates described in Table
VIII, or similar primers and other 2031 OR listed in Table
III.

Table VIII. Primer pairs for the recombinant expression of
2031 OR family proteins

Species        Template        Primer A        Primer B
A. fumigatus   SEQ ID No. 2    SEQ ID No. 55   SEQ ID No. 56
A. fumigatus   SEQ ID No. 5    SEQ ID No. 57   SEQ ID No. 58
A. fumigatus   SEQ ID No. 7    SEQ ID No. 59   SEQ ID No. 60
A. nidulans    SEQ ID No. 9    SEQ ID No. 61   SEQ ID No. 62
C. albicans    SEQ ID No. 11   SEQ ID No. 63   SEQ ID No. 64
M. grisea      SEQ ID No. 21   SEQ ID No. 65   SEQ ID No. 66

Example 7. Oxidoreductase assay and inhibitor screening

The assay for 2031 OR is based on methods described by
Abramovitz & Massey (1976, J. Biol. Chem. 251: 5321-5326)
and Stott et al. (1993, J. Biol. Chem. 268: 6097-6106)
and relies upon the ability of this enzyme to oxidise
the pyridine nucleotides NADH and/or NADPH. The peak of
absorbance for the reduced form of these cofactors (i.e.
NADH and NADPH) is at a wavelength of 340 nm whereas the
oxidised forms of the cofactors (i.e. NAD+ and NADP+) do
not absorb at this wavelength. Conversion of NAD(P)H to
NAD(P)+ can therefore be monitored spectrophotometrically
at a wavelength of 340 nm. A similar assay can be employed
for all oxidoreductases that use NADH or NADPH as a
cofactor.

Assays were carried out in 96-well plates. To each well
was added the following: recombinant 2031 OR (10-1000 ng);
40 µl of 125-2500 µM NADPH; 1 µL 100 mM cyclohexenone or
other substrate, and the volume made up to 200 µL with
0.1 M potassium phosphate pH 7.0. Samples were incubated
at room temperature and absorbance measurements were taken
at 340 nm every 30 seconds for 10 min. The change in
absorbance was expressed as nmoles NADPH oxidised, using
the molar extinction coefficient of NADPH and NADH at
340 nm of 6270 (i.e., a 1 M solution has an optical density
of 6270 at this wavelength).
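
The conversion from a change in absorbance to nmoles of NADPH
oxidised follows from the Beer-Lambert law, and may be
sketched as below. The extinction coefficient and well volume
are those given in the text; the optical path length of the
well is not stated and the value used here is an assumption
that would need to be measured for the plate and reader in
use. The function name is hypothetical.

```python
EPSILON_340 = 6270.0      # M^-1 cm^-1, value quoted in the text
WELL_VOLUME_L = 200e-6    # 200 µL assay volume, from the text
PATH_LENGTH_CM = 0.56     # ASSUMPTION: not given in the text; plate dependent


def nmol_nadph_oxidised(delta_a340, path_cm=PATH_LENGTH_CM,
                        volume_l=WELL_VOLUME_L, epsilon=EPSILON_340):
    """Convert a decrease in A340 to nmol NADPH oxidised (A = epsilon * c * l)."""
    molar_change = delta_a340 / (epsilon * path_cm)  # mol/L
    return molar_change * volume_l * 1e9             # nmol in the well


# Example: a decrease of 0.1 absorbance units over the course of the assay.
print(f"{nmol_nadph_oxidised(0.1):.2f} nmol NADPH oxidised")
```

Dividing the result by the elapsed time and the mass of enzyme
in the well would give a specific activity, should that be
required.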

Initial experiments with a variety of potential substrates
for recombinant 2031 OR showed that the protein had a
functional dehydrogenase activity and determined that
cyclohexenone was a better substrate than menadione,
duroquinone or N-ethylmaleimide. This is illustrated in
Figure 5. Final concentrations in the assay were as
follows: 500 µM substrate, 1 µg/200 µL 2031 OR, 120 µM
NADPH.

5 

Although the physiological substrates of 2031 OR remain to 
be determined, generic oxidoreductase substrates such as 
f erricyanide, methylene blue, phenazine . methosulphate and 
2, 6-dichlorophenolindophenol may also be used to assay for 

10 oxidoreductase activity. 

Screens for inhibitors of 2031 OR can be carried out using 
the assay described above modified by the addition of 
putative inhibitor substances to the reactions and 
decreasing the amount of potassium phosphate buffer. 
Assays can be carried out in 384- or 1536-well plates to 
increase throughput of the screen.
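One way in which the readings from such a screen might be reduced to a single figure per well is sketched below. The specification does not prescribe a scoring method; the least-squares rate estimate, the percent-inhibition formula and all names are assumptions made for illustration.

```python
def rate_per_min(times_s, a340):
    """Least-squares slope of A340 against time, in absorbance units per minute."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_a = sum(a340) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, a340))
    return 60.0 * sxy / sxx


def percent_inhibition(rate_with_compound, rate_without_compound):
    """Percentage reduction in the rate of NADPH oxidation caused by a compound."""
    return 100.0 * (1.0 - rate_with_compound / rate_without_compound)


if __name__ == "__main__":
    times = [0, 30, 60, 90]                # seconds; readings every 30 s as in the assay
    control = [1.00, 0.90, 0.80, 0.70]     # hypothetical well without inhibitor
    treated = [1.00, 0.95, 0.90, 0.85]     # hypothetical well with putative inhibitor
    print(percent_inhibition(rate_per_min(times, treated),
                             rate_per_min(times, control)))
```

A no-enzyme well would ordinarily be subtracted from both rates first, to allow for non-enzymatic oxidation of NADPH.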

Example 8. Method for detecting fungal infection 

The sequences described in the invention were exploited 
to diagnose fungal infections. Samples from patients 
potentially carrying an infection with A. fumigatus, A. 
nidulans or C. albicans, or rice leaves or stems 
potentially infected with M. grisea, or alfalfa 
infected with C. trifolii, or wheat infected with F. 
graminearum, F. sporotrichioides or M. graminicola, or 
other organisms, are processed to extract DNA using the 
DNeasy Tissue kit or QIAamp DNA Blood Mini kit (Qiagen, 
Crawley, UK), although other DNA preparation methods are 
available and suitable. 

Once DNA has been prepared, PCR reactions are set up as 
follows: 

Reaction mix: 

12.5 µl 2x ReddyMix PCR mastermix (ABgene) 
1 µl primer A (5 pmol) 
1 µl primer B (5 pmol) 
5 µl template DNA 
5.5 µl nuclease-free water 
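Where several samples and controls are run together, the common components of the reaction mix above may be pooled and dispensed before the template is added. The following sketch scales the per-reaction volumes; the 10% pipetting overage and the names used are assumptions and do not form part of the described method.

```python
# Per-reaction volumes (µl) of the components common to every tube.
REACTION_MIX_UL = {
    "2x ReddyMix PCR mastermix": 12.5,
    "primer A": 1.0,
    "primer B": 1.0,
    "nuclease-free water": 5.5,
}
TEMPLATE_UL = 5.0  # template DNA is added to each tube individually


def master_mix(n_reactions, overage=0.1):
    """Volumes (µl) of each common component for n_reactions plus an overage."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in REACTION_MIX_UL.items()}


total_per_reaction = sum(REACTION_MIX_UL.values()) + TEMPLATE_UL

if __name__ == "__main__":
    print(total_per_reaction)   # total volume of one reaction
    print(master_mix(8))        # e.g. six samples plus two controls
```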

Suitable primer pairs are given in Table IX below: 



Table IX. Primer pairs for PCRs to diagnose fungal 
infection. 

Species         Template         Primer A*               Primer B* 
A. fumigatus    SEQ ID No. 1     SEQ ID No. 67 (94)      SEQ ID No. 68 (286) 
A. fumigatus    SEQ ID No. 4     SEQ ID No. 69 (239)     SEQ ID No. 70 (450) 
A. fumigatus    SEQ ID No. 7     SEQ ID No. 71 (1097)    SEQ ID No. 72 (1271) 
C. albicans     SEQ ID No. 11    SEQ ID No. 73 (103)     SEQ ID No. 74 (277) 
M. grisea       SEQ ID No. 20    SEQ ID No. 75 (385)     SEQ ID No. 76 (620) 

* Figures in brackets after SEQ ID No. indicate the base in 
the template at which the primer starts. 
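The bracketed start positions in Table IX give a rough indication of the product size to be expected on a gel. The sketch below simply takes the distance between the two start positions for each pair. Whether the position given for primer B refers to its 5' end or to the lowest-numbered base of its binding site is not stated, so the true product length may differ from this figure by up to the length of primer B; the names used are illustrative assumptions.

```python
# (species, template SEQ ID No.) -> (primer A start, primer B start), from Table IX.
PRIMER_STARTS = {
    ("A. fumigatus", 1): (94, 286),
    ("A. fumigatus", 4): (239, 450),
    ("A. fumigatus", 7): (1097, 1271),
    ("C. albicans", 11): (103, 277),
    ("M. grisea", 20): (385, 620),
}


def start_span(primer_a_start, primer_b_start):
    """Distance in bases between the two primer start positions on the template."""
    return primer_b_start - primer_a_start


SPANS = {key: start_span(*starts) for key, starts in PRIMER_STARTS.items()}

if __name__ == "__main__":
    for (species, seq_id), span in SPANS.items():
        print(f"{species} (SEQ ID No. {seq_id}): about {span} bp between primer starts")
```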



Appropriate controls include: (i) template DNA but no 
primers, and primers but no template (negative controls); (ii) 
cDNA encoding fungal 2031 OR or DNA from cultured fungi 
instead of patient DNA (positive control). 

PCR reactions are run as follows: 

Step 1    95°C    5 min 
Step 2    95°C    1 min 
Step 3    53°C    1 min 30 sec 
Step 4    72°C    1 min 30 sec 
Step 5    72°C    10 min 
Step 6    4°C     Hold 

30 cycles of steps 2-4 are carried out and the PCR 
products examined on agarose gels. The production of a 
band of the correct molecular weight is diagnostic of the 
presence of the particular fungus. It may additionally be 
necessary to carry out diagnostic restriction digests of 
the PCR products. If necessary, PCR products are subcloned 
into a vector, such as pGEM-T Easy (Promega), and sequenced 
to verify that the PCR products are from the appropriate 
fungus. 
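For planning purposes, the programmed duration of the cycling protocol can be totalled as in the following sketch. Step 4 is taken as 1 min 30 sec, ramping between temperatures is ignored, and the names are assumptions for illustration only.

```python
# Hold times in seconds for steps 1 to 5 (step 6 is an indefinite 4°C hold).
STEP_SECONDS = {1: 5 * 60, 2: 60, 3: 90, 4: 90, 5: 10 * 60}
CYCLES = 30  # steps 2-4 are repeated


def total_run_seconds(cycles=CYCLES):
    """Programmed time for the whole run, excluding temperature ramps."""
    per_cycle = STEP_SECONDS[2] + STEP_SECONDS[3] + STEP_SECONDS[4]
    return STEP_SECONDS[1] + cycles * per_cycle + STEP_SECONDS[5]


if __name__ == "__main__":
    print(total_run_seconds() / 60, "min")
```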

Alternatively, the presence of an infection with A. 
fumigatus, A. nidulans, C. albicans, M. grisea, C. 
trifolii, F. graminearum, F. sporotrichioides or M. 
graminicola, or other organisms is detected by means of 
antibodies raised against the fungal protein. One suitable 
means is the use of a capture ELISA. Here, microtitre 
plates are coated with a monoclonal antibody raised 
against the fungal protein. Then the plates are incubated 
with diluted patient samples, or appropriate protein 
extracts of samples (particularly if the samples are 
biopsies or plant tissues). Plates are then incubated with 
a polyclonal antibody (again against the fungal protein). 
Finally, binding of the second antibody is detected by 
means of an enzyme-coupled or fluorescently-labelled 
antibody directed against the polyclonal. In practice, two 
monoclonal or polyclonal antibodies or various 
combinations may be used. 

Example 9. Production of an antibody 

Antibodies against the fungal 2031 ORs will be of 
considerable use as diagnostic reagents (see Example 8 
above). As an immunogen, recombinant domains are used (as 
described in Example 6). Alternatively, synthetic peptides 
corresponding to regions either unique to the individual 2031 ORs, 
or likely to provide cross-reactivity within a set of ORs, 
a set of species, or a range of genera are used. Peptides 
may need to be conjugated to carrier proteins before 
immunisation. 

Preimmune sera from animals to be immunised are screened 
against the immunogen to ensure that there is no 
endogenous cross-reactivity. Animals (typically sheep, 
rabbits or mice) are then immunised. For polyclonal 
antibody production, the resulting sera are affinity 
purified using the immunogen cross-linked to a 
chromatography matrix. Alternatively, purification of the 
antibody fraction from the serum, e.g. using protein G or 
protein A cross-linked to a matrix, may be sufficient. 
Monoclonal antibody production proceeds by methods 
familiar to those skilled in the art. 

The specificities of the resulting polyclonal and/or 
monoclonal antibodies are checked by ELISA and/or western 
blotting using the immunogen, related constructs or whole 
cell lysates and extracts as targets. Negative controls, 
such as other ORs, different constructs or different 
species are also employed to test specificity and/or to 
determine the range of species and/or genus 
cross-reactivity. 

Example 10. Production of fungi with 2031 OR genes 
functionally disabled. 

A BAC (bacterial artificial chromosome) clone library 
containing the A. fumigatus genome, partially digested 
with BamHI and inserted into the vector pBACe3.6, was 
purchased from the Sanger Centre, Cambridge, UK. The BAC 
clone containing the gene to be inactivated is identified 
by bioinformatics (BLAST searching of Sanger BAC and 
related databases) and the glycerol stock of the clone 
grown up in 50 ml LB, 20 µg/ml chloramphenicol at 37°C 
overnight. The overnight culture is centrifuged at 4,500 
rpm for 15 min. The bacterial pellet is resuspended in 4 
ml of buffer P1 (Qiagen plasmid miniprep kit) and then 4 
ml of buffer P2 (Qiagen plasmid miniprep kit, lysis 
buffer) is added and mixed gently by inverting 3-6 times. 
Proteins and genomic DNA are precipitated by adding 4 ml 
of buffer P3 (Qiagen plasmid miniprep kit, neutralizing 
buffer) and incubating on ice for 10 minutes. Following 
centrifugation of the mixture at 4,500 rpm for 30 min, 
the supernatant is transferred into a 50 ml Falcon tube, 
an equal volume of phenol/chloroform (1:1) mixture is 
added, and the mixture centrifuged for 15 min at 4,500 rpm. 
The supernatant is then transferred into an Oakridge tube 
and 0.7 volumes isopropanol are added. After mixing, the 
tube is centrifuged at 10,000 rpm (Beckman centrifuge, 
rotor JA-17) for 30 min at 4°C. The resulting pellet is 
washed with 2 ml 70% ethanol at the same speed. The 
resulting BAC DNA is resuspended in 100 µl buffer EB. 

The transposition reaction is carried out as follows. 7 µl 
purified BAC, 1 µl transposon pZVK2 (an engineered plasmid, 
the sequence of which is given as SEQ ID No. 81, 
containing the mosaic ends of pMOD2 (Epicentre), a 
kanamycin resistance gene and a Zeocin resistance gene 
under the control of a fungal promoter) and 1 µl EZ::TN 
transposase (Epicentre) are incubated at 37°C for two hours, 
after which 1 µl stop solution (1% SDS) is added and the 
mixture heated to 70°C for 10 minutes. Electrocompetent 
GeneHogs E. coli cells (Invitrogen) are then transformed 
with the transposed BAC, the cells plated onto LB agar, 25 
µg/ml kanamycin, 20 µg/ml chloramphenicol, and plates 
incubated overnight at 37°C. 

At least 96 colonies are picked and grown up in 96-well 
plates in 2xLB (double concentrated LB), 20 µg/ml 
chloramphenicol, at 37°C overnight. BAC DNA is then 
purified using the Millipore Montage 96 BAC kit using a 
MWG ROBOSEQ 4200 robot. BACs containing the transposon 
inserted into the gene of interest are identified by PCRs 
both spanning the gene of interest and extending from the 
transposon into the BAC. Insertion into the gene of 
interest is manifested as an increase in product size. 
Southern blots are also carried out to ensure that the 
transposon has only inserted once into the BAC. 

The BAC is then linearised using a restriction enzyme 
determined to cut in the vector backbone but not the BAC 
DNA, and used to transform A. fumigatus strain Af293. A. 
fumigatus (haploid) protoplasts are prepared using 5% 
Glucanex (Novo Nordisk A/S) solution (in 0.6 M KCl) and 
shaking for 2 h at 80 rpm at 30°C. The protoplasts are 
washed with 0.6 M KCl and then with STC (Sorbitol, Tris, 
CaCl2). The washed protoplasts are diluted in STC to 10^/ml 
and 100 µl transferred into 14 ml Falcon tubes. 7 µl of 
linearised BAC are added to the tube and the whole mixture 
incubated on ice for 20 min. Transformation is carried out 
by adding 200 µl of PEG 8000 solution (60% w/v, pH 7.5) 
drop-wise over 2 min and then adding 800 µl PEG. The 
mixture is left at room temperature for 20 min. 
Transformed protoplasts are washed with STC, resuspended 
in 1 ml STC, spread onto CM-sorbitol-Zeocin (250 µg/ml) 
plates and incubated at 37°C. 

After 4-10 days of incubation, Zeocin-resistant colonies 
are picked and checked for presence of the knocked-out 
gene by PCR using primers which specifically amplify the 
whole gene of interest. Usually 10-20 transformants are 
checked. Ectopic integration of the BAC gives two 
bands by PCR, one for the endogenous gene and one for the 
BAC/transposon construct, which has a higher molecular 
weight. Replacement of the endogenous gene with the 
transposon-modified gene results in a single band of 
higher molecular weight by PCR. If none of the 
transformants show the disrupted endogenous gene, the gene 
of interest may be essential, with the knock-out cells 
having died and only cells where replacement is 
unsuccessful surviving. In this case, the transformation 
is carried out on diploids using the same method of 
transformation. Essentiality of the gene is then tested by 
rehaploidisation, and examining the segregation pattern in 
haploids. 
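The interpretation of the PCR band patterns described above may be summarised as a simple decision rule. The sketch below is illustrative only: the function name, the size tolerance and the example band sizes are assumptions, and in practice the expected sizes depend on the gene and on the transposon used.

```python
def classify_transformant(band_sizes_bp, wild_type_bp, tolerance_bp=50):
    """Interpret whole-gene PCR bands from one Zeocin-resistant colony.

    band_sizes_bp: sizes of the bands seen on the gel (bp).
    wild_type_bp:  expected product size for the undisrupted endogenous gene.
    """
    has_endogenous = any(abs(b - wild_type_bp) <= tolerance_bp for b in band_sizes_bp)
    has_larger = any(b > wild_type_bp + tolerance_bp for b in band_sizes_bp)
    if has_endogenous and has_larger:
        return "ectopic integration"      # endogenous gene plus BAC/transposon copy
    if has_larger:
        return "gene replacement"         # only the transposon-disrupted copy remains
    if has_endogenous:
        return "endogenous gene only"     # no disrupted copy detected
    return "uninterpretable"


if __name__ == "__main__":
    # Hypothetical example: 1.2 kb gene, disrupted copy running at about 4 kb.
    for bands in ([1200, 4000], [4000], [1200], []):
        print(bands, "->", classify_transformant(bands, wild_type_bp=1200))
```

If every colony examined falls into the "ectopic integration" class, this is the situation in which the text proposes repeating the transformation in diploids.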

The reader's attention is directed to all papers and 
documents which are filed concurrently with or previous to 
this specification in connection with this application and 
which are open to public inspection with this 
specification, and the contents of all such papers and 
documents are incorporated herein by reference. 

All of the features disclosed in this specification 
(including any accompanying claims, abstract and 
drawings), and/or all of the steps of any method or 
process so disclosed, may be combined in any combination, 
except combinations where at least some of such features 
and/or steps are mutually exclusive. 

Each feature disclosed in this specification (including 
any accompanying claims, abstract and drawings), may be 
replaced by alternative features serving the same, 
equivalent or similar purpose, unless expressly stated 
otherwise. Thus, unless expressly stated otherwise, each 
feature disclosed is one example only of a generic series 
of equivalent or similar features. 

The invention is not restricted to the details of the 
foregoing embodiment(s). The invention extends to any 
novel one, or any novel combination, of the features 
disclosed in this specification (including any 
accompanying claims, abstract and drawings), or to any 
novel one, or any novel combination, of the steps of any 
method or process so disclosed. 

Sequence Listing 

SEQ ID No 1 

GTTCGACGTCATTGCCACGTTTCGACCCAAGGGCAGACGCCATGTCGCCGAGCGATCGCCGCGATATGCCTCGAATT 
TGCGCCATTCGGCATCCAGTTTCCAGTGCCCTTCCCCGAAT6ACTGTCTCCACTATTCGGCAAGATTGTAAATCAAG 
CCTGAAGAAGCGGAGCAATTCTTGGAAGTCGTATGTTCTACTGATTTCTGTGCCTGGCGCAGACGGGTATATAAATA 
AAGATCACCGCACCGAGGAGTTTCTTACCAACCCATCAATAACCATCCACAATCTCCTACAACAAAAATGACTGTCG 
CCGATATCGACGTTCCTCCTGCCGAGGGCATCCCCTACTTCACTCCGGCCCAGAACCCTCCTGCCGGTACGGCAGCT 

AACCCCCAGACCAATGGCCAGAAGATCCCCAAGCTCTTCACGCCCTTGACCATCCGTGGCGTCACCTTCCAGAACCG 
CCTTGGTGTAAGTCCGTTTGCCCTTGCTCATATCGACGAAAGCTAATCCCCCGTCAGCTCGCGCCCCTCTGCCAATA 
CTCCGCCCAGGACGGCCACATGACCGACTACCACATCGCCCATCTGGGTGGGATCGCCCAACGCGGACCCGGCCTGA 
TGCTGATTGAGGCGACCGCCGTCCAGCCCGAAGGCCGCATCACCCCTCAGGATGTCGGTCTGTGGAAGGACTCCCAG 
ATCGCCCCGATGCGCCGGGTCATCGACTTCGTGCACAGCCAGGGCCAGAAGATCGGCGTGCAGCTTGCCCATGCCGG 

CCGGAAAGCCACCACCGTTGCGCCCTGGATCTCATTCTCGGCCATCGCGACGGAGAAGGTCGGCGGATGGCCGGACC 
GCGTCAAAGGGCCCGGCGATATCCCCTTTGCGGAGCCCTTCGCCAAGCCCAAGGCCATGACGCTGGATGAGATCGAG 
CAGTTCAAGAAGGACTGGGTGGCGGCCACGAAGCGCGCCATCGCCGCCGGTGCGGACTTTGTCGAGATTCACAATGC 
GCATGGATACCTGCTGTCGTCATTCCTCTCGCCGGCCGCCAACAACCGCACGGACCAGTACGGCGGGTCGTTCGAGA 
ACCGCATCCGGCTGTCTCTCGAGATTGCGCAGTTGACTCGGGACGCCGTCGGCCCTCATGTGCCCGTTTTCCTGCGC 

ATTTCGGCCTCGGACTGGTGCGAGGAGACCCTGCCGGAGCAGAGCTGGAAGTCGGAGGATACCGTGCGGTTCGCGCA 
GGAGCTGGTCAAGCAGGGCGCCGTTGATCTGATCGATATCAGCAGCGGTGGTGTTCTCGCGCAGCAGAAGATCAAGT 
CCGGCCCTGCCTTCCAGGTGCCTTTTGCCGTGGCCGTGAAGAAGGCCGTCGGCGACAAGCTGCTGGTTGCCGCCGTG 
GGTGCCATCACCAACGGCAAGCAGGCGAATCAGATTCTAGAGGAGCAGGATATCGACGTTGCGCTGGTTGGCCGTGG 
GTTCCAGAAGGATCCCGGTCTGGCCTGGACGTTTGCTCAGCACCTCGGCGTCGAAATCTCCATGGCCAACCAGATCC 

GCTGGGGCTTCACCCGGCGTGGAGGCACCCCGTACATTGATCCTTCGGTGTACAAGCAGTCTATTTTCGATGTATAG 
AGTATAGATAGAGTTGAAGATGATACCTCATAGACGATCAATGGACCCTTGCATATTATTTCTCGTCTCCTGCGTAT 
GTTCAAGGTATTCACAGTAGCTGCGTCCTCTTAAGTTTCTCCGTCATTCGTTCTATTCTACTCCAATCGCAACGCAT 
GGCGACCACGGATCGAGTCGAATTTCTCCGTCGTTCGTATCTGATCAATATAAAAAGCGGGGAATGGCTTGACCCCG 
CGCAGAATGTCGATCTCTTCGCAAACTCTCGGTGTATAGGACGCTCAGCAACGATCAAGG 

SEQ ID No 2 

GTATGTTCTACTGATTTCTGTGCCTGGCGCAGACGGGTATATAAATAAAGATCACCGCACCGAGGAGTTTCTTACCA 
ACCCATCAATAACCATCCACAATCTCCTACAACAAAAATGACTGTCGCCGATATCGACGTTCCTCCTGCCGAGGGCA 
TCCCCTACTTCACTCCGGCCCAGAACCCTCCTGCCGGTACGGCAGCTAACCCCCAGACCAATGGCCAGAAGATCCCC 
AAGCTCTTCACGCCCTTGACCATCCGTGGCGTCACCTTCCAGAACCGCCTTGGTCTCGCGCCCCTCTGCCAATACTC 
CGCCCAGGACGGCCACATGACCGACTACCACATCGCCCATCTGGGTGGGATCGCCCAACGCGGACCCGGCCTGATGC 
TGATTGAGGCGACCGCCGTCCAGCCCGAAGGCCGCATCACCCCTCAGGATGTCGGTCTGTGGAAGGACTCCCAGATC 
GCCCCGATGCGCCGGGTCATCGACTTCGTGCACAGCCAGGGCCAGAAGATCGGCGTGCAGCTTGCCCATGCCGGCCG 
GAAAGCCACCACCGTTGCGCCCTGGATCTCATTCTCGGCCATCGCGACGGAGAAGGTCGGCGGATGGCCGGACCCGC 
GTCAAAGGGCCCGGCGATATCCCCTTTGCGGAGCCCTTCGCCAAGCCCAAGGCCATGACGCTGGATGAGATCGAGCA 
GTTCAAGAAGGACTGGGTGGCGGCCACGAAGCGCGCCATCGCCGCCGGTGCGGACTTTGTCGAGATTCACAATGCGC 
ATGGATACCTGCTGTCGTCATTCCTCTCGCCGGCCGCCAACAACCGCACGGACCAGTACGGCGGGTCGTTCGAGAAC 
CGCATCCGGCTGTCTCTCGAGATTGCGCAGTTGACTCGGGACGCCGTCGGCCCTCATGTGCCCGTTTTCCTGCGCAT 
TTCGGCCTCGGACTGGTGCGAGGAGACCCTGCCGGAGCAGAGCTGGAAGTCGGAGGATACCGTGCGGTTCGCGCAGG 
AGCTGGTCAAGCAGGGCGCCGTTGATCTGATCGATATCAGCAGCGGTGGTGTTCTCGCGCAGCAGAAGATCAAGTCC 
GGCCCTGCCTTCCAGGTGCCTTTTGCCGTGGCCGTGAAGAAGGCCGTCGGCGACAAGCTGCTGGTTGCCGCCGTGGG 
TGCCATCACCAACGGCAAGCAGGCGAATCAGATTCTAGAGGAGCAGGATATCGACGTTGCGCTGGTTGGCCGTGGGT 
TCCAGAAGGATCCCGGTCTGGCCTGGACGTTTGCTCAGCACCTCGGCGTCGAAATCTCCATGGCCAACCAGATCCGC 
TGGGGCTTCACCCGGCGTGGAGGCACCCCGTACATTGATCCTTCGGTGTACAAGCAGTCTATTTTCGATGTATAGAG 
TATAGATAGAGTTGAAGATGATACCTCATAGACGATCAATGGACCCTTGCATATTATTT 



SEQ ID No 3 

MTVADIDVPPAEGIPYFTPAQNPPAGTAANPQTNGQKIPKLFTPLTIRGVTFQNRLGIAPLCQYSAQDGHMTDYHIA 
HLGGIAQRGPGLMLIEATAVQPEGRITPQDVGLWKDSQIAPMRRVIDFVHSQGQKIGVQLAHAGRKATTVAPWISFS 
AIATEKVGGWPDRVKGPGDIPFAEPFAKPKAMTLDEIEQFKKDWVAATKRAIAAGADFVEIHNAHGYLLSSFLSPAA 
NNRTDQYGGSFENRIRLSLEIAQLTRDAVGPHVPVFLRISASDWCEETLPEQSWKSEDTVRFAQELVKQGAVDLIDI 
SSGGVIAQQKIKSGPAFQVPFAVAVKKAVGDKLLVAAVGAITNGKQANQILEEQDIDVALVGRGFQKDPGLAWTFAQ 
HLGVEISMANQIRWGFTRRGGTPYIDPSVYKQSIFDV 

SEQ ID No 4 

atgtcgcaacctgttgtgcctgacatcgagaacaaacccgcgccgggtatctcgtactttactccggcgcaagagcc 
gcctgctggcaccgctgctaatcctcagtctgatggatcggcacctcccaagctcttccggccgctttcggtgcggg 
gtctgacctttcacaatcgcattggcgtgagtgcagtccaggcaattatgctatccatcctatgcgagcccttgcat 

tggaacagccgcttacagggaatgataatgagtagctatcgccactctgccaatactcagccgacgatggacacatg 
actccctggcatatggcacatcttggagggattgcccagcgagggccaggattcttgatggtcgaggcaacagcagt 
cgaaccggaaggcaggatcaccccgcaggacctgggactatggaaagactcgcagattgagccattgagccgcgtga 

tcgagtttgtccacagtcagaaccagcttatcggcgtgcagatcgcacacgcaggtcgcaaggccagcaccgtcgcg 
ccatggctctcggccaacgataccgcctccgagaagatgggcggctggccaggccgcgtcaaaggcccgacaaatgt 

gcccttcaccgttaagaaccctgtgccgaaggagatgaccaagcaggatatcgaggatctgaagaccgcctgggtgg 
ccgctgtcaaacgggctgttaaggccggagccgactttatcgagatccacaatgcgcatggctatcttctgatgtcg 
ttcctctcccctgcggtcaacacgagaacagacgagtacggaggcagttttgagaatcgcatccggctcagtctgga 
gatcgccaagctcacccgcgaaaatgtgcccaaggatatgcctgtcttcctgcgggtctccgccaccgattggctgg 
aggaggtgcagccgaacaagcccagctggcgaggcgtggacactgtccgatttgcgaagatcctggcagaaacgggt 

tacgttgacgtgcttgacgtgagcagtggcggcactcattcggagcagcatatccacgcgaagccaggcttccaggc 
accctttgctattgccgtcaagaacgccgtcggggacaaactcgcagtggcatcagtgggtatgattgccagcgcgc 
atttggccaattccttgttggagaaggacggactggaccttgtgctggttggacgtggcttccagaagaacccgggg 
ctggtgtgggcgtgggccgacgagctgaatgtagagatctccatggctaatcagatccgatggggtttctcgcggcg 
cggtgctggtccttacctcaggaagaaactcgagaagatataa 

SEQ ID No 5 

ATGTCGCAACCTGTTGTGCCTGACATCGAGAACAAACCCGCGCCGGGTATCTCGTACTTTACTCCGGCGCAAGAGCC 
GCCTGCTGGCACCGCTGCTAATCCTCAGTCTGATGGATCGGCACCTCCCAAGCTCTTCCGGCCGCTTTCGGTGCGGG 

GTCTGACCTTTCACAATCGCATTGGCCTATCGCCACTCTGCCAATACTCAGCCGACGATGGACACATGACTCCCTGG 
CATATGGCACATCTTGGAGGGATTGCCCAGCGAGGGCCAGGATTCTTGATGGTCGAGGCAACAGCAGTCGAACCGGA 
AGGCAGGATCACCCCGCAGGACCTGGGACTATGGAAAGACTCGCAGATTGAGCCATTGAGCCGCGTGATCGAGTTTG 
TCCACAGTCAGAACCAGCTTATCGGCGTGCAGATCGCACACGCAGGTCGCAAGGCCAGCACCGTCGCGCCATGGCTC 
TCGGCCAACGATACCGCCTCCGAGAAGATGGGCGGCTGGCCAGGCCGCGTCAAAGGCCCGACAAATGTGCCCTTCAC 

CGTTAAGAACCCTGTGCCGAAGGAGATGACCAAGCAGGATATCGAGGATCTGAAGACCGCCTGGGTGGCCGCTGTCA 
AACGGGCTGTTAAGGCCGGAGCCGACTTTATCGAGATCCACAATGCGCATGGCTATCTTCTGATGTCGTTCCTCTCC 
CCTGCGGTCAACACGAGAACAGACGAGTACGGAGGCAGTTTTGAGAATCGCATCCGGCTCAGTCTGGAGATCGCCAA 
GCTCACCCGCGAAAATGTGCCCAAGGATATGCCTGTCTTCCTGCGGGTCTCCGCCACCGATTGGCTGGAGGAGGTGC 
AGCCGAACAAGCCCAGCTGGCGAGGCGTGGACACTGTCCGATTTGCGAAGATCCTGGCAGAAACGGGTTACGTTGAC 

GTGCTTGACGTGAGCAGTGGCGGCACTCATTCGGAGCAGCATATCCACGCGAAGCCAGGCTTCCAGGCACCCTTTGC 
TATTGCCGTCAAGAACGCCGTCGGGGACAAACTCGCAGTGGCATCAGTGGGTATGATTGCCAGCGCGCATTTGGCCA 
ATTCCTTGTTGGAGAAGGACGGACTGGACCTTGTGCTGGTTGGACGTGGCTTCCAGAAGAACCCGGGGCTGGTGTGG 
GCGTGGGCCGACGAGCTGAATGTAGAGATCTCCATGGCTAATCAGATCCGATGGGGTTTCTCGCGGCGCGGTGCTGG 
TCCTTACCTCAGGAAGAAACTCGAGAAGATATAA 

SEQ ID No 6 

MSQPVVPDIENKPAPGISYFTPAQEPPAGTAANPQSDGSAPPKLFRPLSVRGLTFHNRIGLSPLCQYSADDGHMTPW 
HMAHLGGIAQRGPGFLMVEATAVEPEGRITPQDLGLWKDSQIEPLSRVIEFVHSQNQLIGVQIAHAGRKASTVAPWL 
SANDTASEKMGGWPGRVKGPTNVPFTVKNPVPKEMTKQDIEDLKTAWVAAVKRAVKAGADFIEIHNAHGYLLMSFLS 
PAVNTRTDEYGGSFENRIRLSLEIAKLTRENVPKDMPVFLRVSATDWLEEVQPNKPSWRGVDTVRFAKILAETGYVD 
VLDVSSGGTHSEQHIHAKPGFQAPFAIAVKNAVGDKLAVASVGMIASAHLANSLLEKDGLDLVLVGRGFQKNPGLVW 
AWADELNVEISMANQIRWGFSRRGAGPYLRKKLEKI 
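Each coding sequence in this listing can be checked against the corresponding protein sequence (for example SEQ ID No 5 against SEQ ID No 6) by translation with the standard genetic code. The following is a minimal sketch of such a check; the function name is an assumption, and only the first thirty bases of SEQ ID No 5 are used so that the example remains short.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence up to the first stop codon.

    Whitespace is ignored and case is folded; any codon containing an
    unexpected character is reported as 'X' rather than guessed.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    start_of_seq_id_5 = "ATGTCGCAACCTGTTGTGCCTGACATCGAG"
    print(translate(start_of_seq_id_5))
```

Because unexpected characters are reported as X and not silently dropped, a transcription error in a nucleotide sequence shows up as a local mismatch and does not shift the reading frame.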

SEQ ID No 7 

ATGGGTTCCAACGCCTTCCGGTCCCCCGCCGTCACCAAGTCCTCCTCCACCCCCTACTACACTCCCGCCAACAATGG 
AGGCGCCGCCCTGCACCCCGACGACCCCACGACCCCTACGCTCTTCCGGCCCTTACAAATCCGCAATGTGACGCTCA 
AGAACCGCATCATGGTGTCGCCCATGTGCATGTACTCCTGCGAGTCGGACCCGTCGTCTCCCCACGTCGGCGCCCTA 
ACAAACTACCACCTGGCGCATCTGGGCCACCTCGCCCTCAAAGGCGCAGGCCTCGTCTTCATCGAAGCCACCGCCGT 

GCAGCCCAACGGGCGCATCTCCCCCAACGACTCGGGCCTCTGGCAGGACGGCACCACCTCGGAACAATTCCTGGGGC 
TGAAGCGGGTCGTCGAGTTCATGCACGCACAGGGCGCCAAGGTCGGGATCCAGCTTGCGCATGCGGGCCGGAAAGCG 
AGTGCCGTTGCGCCGTGGCTGGCGGCGCAGGCGGGCAAGTCGAGTCTGAAGGCGGATGAGAGCGTTGGCGGGTGGCC 
CGCGGATGTGGTGGGTCCGTCGGGCGGGGAGGAGCATATCTTTAGTCCCGAGGAGGATGCGTATTGGGTGCCGCGGG 
CGCTGAGCACGGCCGAGGTCCGTCAGGTGGTGGCGGCGTTTGCGAAGAGCGCGCGGCTAGCGGTGCAGGCTGGGGTG 

GATGTTATCGAGATCCATGGGGCGCATGGCTATCTCATCAACGAGTTCCTGAGCCCGGTCACGAATAAGCGGACGGA 
TGCGTACGGCGGGAGCTTTGAGAACCGGACCCGGATCGTGCGCGAGGTTGCGGCGGCTATTCGTGCGGTGATTCCCG 
AGGGGATGCCCCTGTTTCTGCGTATCAGCGCCACGGAGTGGTTGGAGGGTCAGCCGGTGGCCGCGGAGTCGGGCAGC 
TGGGATATGCAGAGCTCGCTGGAGCTGGTCAAGAAGCTGCCCGAATGGGGCATTGACCTGGTGGATGTCAGCTCCGC 
CGCGAACCACAAGGACCAGAAGATCAACCTGCACACGGCCTACCAGACGGATCCTGGCCGGGCAGATTCGCCAGGCCA 

TCCGAGCGGCTGGCGCGTCGACTCTTGTGGGTGCTGTAGGTCTGATCACCGATTCGGAACAGGCGAGGGGACTAGTT 
CAGGGAGCGGACGAGGCGACTGCAGCCGAGGCAATGCTGTCGGGACCTGAACCCAAGGCGGATGCCATTCTGATAGC 
CCGTCAGTTCCTGCGCGAGCCAGAATGGGTGTTTTCCACGGCGAGAAAGTTGGGCGTGCCGGTGACTGTCCCGGTGC 
AGTTTGGCAGGGCCATTTAG 

SEQ ID No 8 

MGSNAFRSPAVTKSSSTPYYTPANNGGAALHPDDPTTPTLFRPLQIRNVTLKNRIMVSPMCMYSCESDPSSPHVGAL 
TNYHLAHLGHLALKGAGLVFIEATAVQPNGRISPNDSGLWQDGTTSEQFLGLKRVVEFMHAQGAKVGIQLAHAGRKA 
SAVAPWLAAQAGKSSLKADESVGGWPADVVGPSGGEEHIFSPEEDAYWVPRALSTAEVRQVVAAFAKSARLAVQAGV 
DVIEIHGAHGYLINEFLSPVTNKRTDAYGGSFENRTRIVREVAAAIRAVIPEGMPLFLRISATEWLEGQPVAAESGS 
WDMQSSLELVKKLPEWGIDLVDVSSAANHKDQKINLHTAYQTDLAGQIRQAIRAAGASTLVGAVGLITDSEQARGLV 
QGADEATAAEAMLSGPEPKADAILIARQFLREPEWVFSTARKLGVPVTVPVQFGRAI 

SEQ ID No 9 

ATGGCTCTCCCTGACGTCGAAAACACCCCCGCCGCCGGCATCCCCTACTTTACACCAGCACAGAACCCTCCTGCTGG 
AACAGCTGCCAACCCGCAAACCAGCGGCAATGCCGTCCCCAAGCTGTACACACCTCTGACGGTGCGTGGGGTGACCT 
TCCACAACAGACTTGGCCTCGCGCCGCTCTGCCAGTACTCCGCAGAAGACGGCCACATGACAGACTACCACATCGCG 

CACTTGGGAGGTATTGCCCAGCGCGGCCCCGGTCTCATGATGATCGAGGCAACCTCCGTCTCACCTGAAGGCAGAAT 
CACGCCGCAGGACGTCGGTTTATGGAAGGACTCGCAGATTGCGCCCATGAAGCGCGTCATCGACTTCGTGCACTCGC 
AGTCCCAGAAGATTGGCGTGCAGATTGCCCACGCCGGCCGCAAGGCTTCGAACATCGCCCCCTGGCTCATGAACAAG 
GGCATCGTCGCGACGGAGAAGGTCGGTGGCTGGCCGGATCGTGTGATCGGCCCGTCCACCGTGCCCTTCCACGAGAC 
TTTCCCCACCCCCAAGGCCATGACCAAGGACGACATCGAGCAGTTCAAGCGCGACTGGTTTGATGCGTGCAAGCGGG 

CCATTGCCGCTGGCGCGGACTTCATCGAGATCCACAATGCCCACGGGTATCTTCTCTCGTCTTTCCTATCACCGTCT 
TCCAACACGCGCACCGACGAGTACGGCGGCTCCTTTGAGAACCGCATCCGGCTCTCTCTCGAAATCGCCCAGGTCAC 
CCGTGACGCCGTCGGCCCCAACGTTCCTGTTTTTCTCCGTGTCTCCGCGACGGACTGGATCGAGGAGACCCTCCCCG 
AGGAATCGTGGAAGCTCTCTGACTCCGTCCGCTTCGCCGAAGCCCTCGCTGCCCAGGGCGCTATTGACCTGATCGAC 
GTCTCTTCCGGCGGTGTCCACGCCGCGCAGAAGATCAAGTCCGGGCCGGCTTTCCAGGCTCCCTTCGCTGTGGCTAT 

CAAGAAGGCCGTTGGCGATAAGCTCCTTGTTGCGACGGTGGGCACGATCACGAACGGTAAGCAGGCGAACAAGCTGC 
TTGAGGAGGAGGGATTGGATGTTGCGCTTGTGGGACGTGGTTTCCAGAAGGATCCCGGTCTGGCGTGGACTTTCGCG 
CAGCATCTTGATGTTGAGATTGCGATGGCGAGTCAGATTCGGTGGGGATTCACAAGGCGCGGGGGCACGCCTTATAT 
CGACCCCAAAGCTTATAAGGAGAGCATCTTTGAGTAA 

SEQ ID No 10 

MALPDVENTPAAGIPYFTPAQNPPAGTAANPQTSGNAVPKLYTPLTVRGVTFHNRLGLAPLCQYSAEDGHMTDYHIA 
HLGGIAQRGPGLMMIEATSVSPEGRITPQDVGLWKDSQIAPMKRVIDFVHSQSQKIGVQIAHAGRKASNIAPWLMNK 
GIVATEKVGGWPDRVIGPSTVPFHETFPTPKAMTKDDIEQFKRDWFDACKRAIAAGADFIEIHNAHGYLLSSFLSPS 
SNTRTDEYGGSFENRIRLSLEIAQVTRDAVGPNVPVFLRVSATDWIEETLPEESWKLSDSVRFAEALAAQGAIDLID 
VSSGGVHAAQKIKSGPAFQAPFAVAIKKAVGDKLLVATVGTITNGKQANKLLEEEGLDVALVGRGFQKDPGLAWTFA 
QHLDVEIAMASQIRWGFTRRGGTPYIDPKAYKESIFE 

SEQ ID No 11 

ATGACAGTTCCATACCAAGTAAAACCATCAGATGAAATCAAAGGTGCTCCTGAGGTTTCCTATTACACTCCAGAACA 
GCCTGTTCCGGCTGGTACTTTTTATCCCCAATCGTCAGATGAAGTTGCTCCCAAAATTTTTCAACCTTTAAAGATTG 
GTAAGCTTGCTTTGCCAAACAGAATTGGGGTATCTCCAATGTGTCAATATTCTGCTGATTATAATTTTGAAGCAACT 
CCATACCATTTAATCCATTATGGTTCATTAGTGAATCGTGGGCCAGGTATCACCATTGTTGAAAGCACGGCTGTTTC 

TCCTGAGGGTGGATTATCACCTCATGATTTAGGAATCTGGAAGGATGAACAAGCAGAGAAATTGAAACCAATTGTCG 
ATTACGCTCATTCTCAAAAGCAATTAATTGCCATCCAATTGGGCCATGGTGGTAGAAAAGCTTCTGGTCAGCCCTTA 
TTTTTGCACTTGGAACAAGTTGCAGATAAATCTGTCAATGGGTTTGCCGACAAAGCAGTTGCTCCTTCTGCATTGGC 
ATTCAGACCAAATGGTAATTTACCTGTTCCTAATGAGTTGACCAAAGATGAAATCAAACGTGTTGTTAAGGATTTTG 
GTGCTGCTGCTAGAAGAGCTGTTGAAATCAGTGGCTTTGATGCAGTTGAGATTCATGGTGCTCATGGTTATTTGATT 

AATGAGTTCTATAGTCCTATTTCAAACAAGAGAACAGATGAATACGGTGGCAGTTTTGAAAATAGAACCAGATTTTT 
AAAGGAAGTTATCGATAGTGTTAAATCAAGTATTCCAAACGATGTTCCAGTGTTTTTGAGAATCTCTGCTGCTGAAA 
ATAGTCCTGATCCAGAAGCTTGGACTATTGAAGATTCCAAAAAATTAGCTGACATTTTAGTAGAAAAGGGTATTGCT 
TTGGTTGATGTTTCATCTGGTGGTAACGATTATAGACAACCACCAAGATCTGGGATCAGTAAAGAGTTGAGAGAGCC 
AATCCATGTTCCGTTGTCTCGTGCAATTAAACAACATGTTGGTGACAAGTTATTGGTCAGTTGCGTTGGTGGGCTTG 

AAAAAGATCCTGAATTGCTCAACAAATATTTAGAAGAAGGAACATTTGATCTTGCTTTGATCGGTAGAGGATTTTTA 
AGAAATCCAGGTTTGGTATGGGAGTTTGCCGATAAACTTGGTGTTAGACTCCACCAGGCCTTGCAGTTAGGTTGGGG 
TTTCTGGCCCAACAAACAACAAATTGTTGATTTGATTGAAAGAACATCTAAATTAGAAGTAAATTAG 

SEQ ID No 12 

MTVPYQVKPSDEIKGAPEVSYYTPEQPVPAGTFYPQSSDEVAPKIFQPLKIGKLALPNRIGVSPMCQYSADYNFEAT 
PYHLIHYGSLVNRGPGITIVESTAVSPEGGLSPHDLGIWKDEQAEKLKPIVDYAHSQKQLIAIQLGHGGRKASGQPL 
FLHLEQVADKSVNGFADKAVAPSALAFRPNGNLPVPNELTKDEIKRVVKDFGAAARRAVEISGFDAVEIHGAHGYLI 
NEFYSPISNKRTDEYGGSFENRTRFLKEVIDSVKSSIPNDVPVFLRISAAENSPDPEAWTIEDSKKLADILVEKGIA 
LVDVSSGGNDYRQPPRSGISKELREPIHVPLSRAIKQHVGDKLLVSCVGGLEKDPELLNKYLEEGTFDLALIGRGFL 
RNPGLVWEFADKLGVRLHQALQLGWGFWPNKQQIVDLIERTSKLEVN 

SEQ ID No 13 

atggal^aacaacaatactataccggcattatttcaacccataaagatcagtgactcgatcacattacctaatagaat 
tggtgtttcaccaatgtgcatgtattcatcgtcaccaactgacaatcaagccactctgtttcattttgttcattatg 
gatcatttgctgtacgtggaccagcattaatcattttagagagtatctttgtgtccgaaaattccggattatccatt 
catgatttaggtctttggaatgatgatcaagctcacagtttacggaaaattgttgattttattcatgatcaagacgg 
aatttgctgtatacaattgaatcacgctgggcgaaagattgttgaaggggtaccattccaacaaatacaacatggtt 

ggcaagaacattgtgtggggccatctactgagccatttagtgattcacacaatacaccacgagaattgactgttaat 
gaaataaattcaattgtggaagactttgccaatgcagcttggcgggctgtggaaatctcaaaattcgatgccattga 
aatacattgtgctaatggatgtttaatacaccaatttttaagtaaattgacaaacaagagagctgaccaatacgggg 
gctcatttgaaaacagagttagatttcttttacaaataattgagaatataaaacgaaagatagaaacaccgattttc 
ttaaagtttccaatgtcagataattgtagtgatccggaagcgtggtctacggaagatgcattgaagttggccgatct 

tgttattgatttaggagtaaaggtgatcgacgttacatcaggtggaaatgttgcgcattgcaaatctagatatctat 
taaatgacgao^aacaactaccttctcaagtgcccttggctcgtaaattgaaaagccacattagaaac 
atcgcatgcagtggaggattagatcgagacatatttaaactcgatgagtttattgctaatggtgactttgatatagc 
attgataggtaaaggatttctcaaaaacactggattgatcagccgtattgctgaccaattgcaagcacaattcagaa 
cagcacctcaatataagttggccttatcataa 

SEQ ID No 14 

mennntipalfqpikisdsitlpnrigvspmcmysssptdnqatlfhfvhygsfavrgpaliilesifvsensglsi 
hdlglwnddqahslrkivdfihdqdgicciqlnhagrkivegvpfqqiqhgwqehcvgpstepfsdshntpreltvn 
einsivedfanaawraveiskfdaieihcangclihqflskltnkradqyggsfenrvrfllqiienikrkietpif 
lkfpmsdncsdpeawstedalkladlvidlgvkvidvtsggnvahcksryllnddkqlpsqvplarklkshirnrcl 
iacsggldrdifkldefiangdfdialigkgflkntglisriadqlqaqfrtapqyklals 

SEQ ID No 15 

atggccgacttcacccagaagaagacctcctcccccgcggccccgggtgttcccttctacaccccggcccaggtccc 
cgccgccggcactcccctcccctccacccccggcgatgtccctactctcttcacccctctcaagatccgtggtgttg 
agctccagaaccgcttcgccgttgcgcccatgtgcacctactctgccgacgatggccacatgaccgactggcacctt 
gtccacctgggctccttcgccctccgcggtgtccccctcaccatcttcgaggccaccggcgtcctccccaacggccg 
catcacccccgagtgctctggtctctggcaggactcccagattgcgcccctcaagcgcatcgtcgactacatccact 
CCCAGGGCCAGAAGGCCGGTATCCAGCTTGCCCACGCCGGCCGCAAGGCCTCCACCAAGGCCCCCTGGCACTACCAG 
CGCGGCAAGAGCGAGCTTGCCGGCCCCGAGCAGGGTGGCTGGCCCGAGAACGTCTGGGCCCCCAGCGCCATCAGCTA 
CAACGAGGAGACCTTCCCCTTCCCCAAGGAGATGACCGTCGAGCAGATCCACGAGCTCGTCGAGGCCTGGAAGGCGT 
CTGCCCAGCGTGCCCTCAAGGCCGGCTTCGACCTCATTGAGATCCACGCCGCCCACGGCTACCTCATTTCCGAGTTC 
TTGAGCCCCATCTCCAACCAGCGTACCGACCAGTACGGTGGCTCCTTCGAGAACCGCACCCGCGTTCTCCGCGAGAT 
CATCTCGGCCGTCCGCTCCGTCATCCCCGAGGACATGCCCCTCTTCGTCCGTGTCTCCGCCACCGAGTGGATGGAGT 
ACACCGGCCAGCCCTCGTGGGACCTCCAGCAGACCATTGAGCTCGCCAAGATCCTCCCCGACCTCGGCGTCGACCTC 
CTCGACGTCTCTTCCGGCGGCAACAACAAGGACCAGAAGATCAACGTCCACACCTACTACCAGATCGACATGGCCGA 
GCAGATCCGCGCGGCCGTGCACGAGGCCGGCAAGCAGCTCCTCGTCGGTGCCGTCGGCTTGGTCACCTCGGCTGAGA 
TCGCCAAGGAGACCGTCCAGGAGAAGGAGGATGGCAGAGTCACCATCCAGCGCGAGAACGGCGCCAAGACTCGTGCC 
GATATGGTCCTTGTTGCCAGGCAGTTCTTGAAGGAGCCCGAGTTCGTCCTCACTGTCGCCGACGAGTTGGGTGTTGA 
TGTCAAGGCCCCTGTTCAGTACCTCCGTGGTCCTCTTAGCAGCAGGCCCAAGAAGTTGACCACTGTTCCTTAA 

SEQ ID No 16 

MADFTQKKTSSPAAPGVPFYTPAQVPAAGTPLPSTPGDVPTLFTPLKIRGVELQNRFAVAPMCTYSADDGHMTDWHL 
VHLGSFALRGVPLTIFEATGVLPNGRITPECSGLWQDSQIAPLKRIVDYIHSQGQKAGIQLAHAGRKASTKAPWHYQ 
RGKSELAGPEQGGWPENVWAPSAISYNEETFPFPKEMTVEQIHELVEAWKASAQRALKAGFDLIEIHAAHGYLISEF 
LSPISNQRTDQYGGSFENRTRVLREIISAVRSVIPEDMPLFVRVSATEWMEYTGQPSWDLQQTIELAKILPDLGVDL 
LDVSSGGNNKDQKINVHTYYQIDMAEQIRAAVHEAGKQLLVGAVGLVTSAEIAKETVQEKEDGRVTIQRENGAKTRA 
DMVLVARQFLKEPEFVLTVADELGVDVKAPVQYLRGPLSSRPKKLTTVP 

SEQ ID No 17 

atggctacttccactacctccgacctcaaactctcccaacccctcaccctccccaatggccttaccctccccaaccg 
cctcgtcaaagccgccatggccgaacaaatgggcttcggcaaccacctgcccaaccccgaactcgccgccgtctacg 
ccacctgggcccgcggcgactggggcctgattctcaccggcaacgtccaagtcgaccacgcgcacaagggcgacgcc 
cacgacatcagccccaaccaccccggcaccacgcccgagcagaccgtcacggccttcaaggcctgggcggacgccgc 
gcgcctgaatggccagtccaaaacgcctgtggtcgtgcagatcaaccaccctggtcgccagagtccgatgggcgcgg 

gcacgcggggactgtgggagaaggcggtggcgccctcgccggtgccgttggtgttgggagaggcgtttgtgcctcgc 
ttgttgtcgaaagtgcttttcggcacgccgcgggagctgacggttgcggagatcaaggatatcgtgcaaaagtttgc 
ggtgacggcgaggatcacggccgaggccgggttcaatggcgtggagatccatgcggcgcatggatacctgttggcgc 
agttcttgagcaagaagacaaacaggcgcggggatgagtatggcgggtcggctgagaacagggcgaggattgttggg 
gagattattaaggagtgcaggaggcaggtgactgaggcggtgggtgaagaggaggcgaagaagtttgtggtgggaat 

caagctgaacagtgcggattggcaggcgggacgcgatggaaaggaggaggaggagacggatacggcggaggaggtgt 
tgaagcagattgagctttttgagcagtgggggatcgactttgtcgaggttagcggtggcagttatgaggatcctcag 
gtaagttttggtgttgtttgagggatggggcaaggggttgtctgtcgtgaacaacaaaaggggcacggaacaaatgc 
taacgccatacagatggccaacggtcccaagcccgaaaagtccgaacgcaccatggcccgcgaggccttcttcctcg 
agttcgccaagatcatccgcaccaagttccccaagcttcctctcatggtcaccggcggcttccgcactcgtcagggc 

atggaggccgctttggaatccgatgattgcgacatgatcggtatcggacgcccggccatcatcaacccttcgcttcc 
cgccaacttgatcctcaacccggaggtgccggatgcggatgcccgcttgttcgacaagaagagggctgagccgcact 
ggatcgttgagaagttgggcatgaagtccattgttggtgctggtgttgaggtggtacgtcacgttccaaccccattt 
gcttcattgtgtttccgagtatgtcatgctgacttggttcttttctagacgtggtatgtgagcgagctcaagaagct 
ggccaagttttag 

SEQ ID No 18 

ATGGCTACTTCCACTACCTCCGACCTCAAACTCTCCCAACCCCTCACCCTCCCCAATGGCCTTACCCTCCCCAACCG 

CCTCGTCAAAGCCGCCATGGCCGAACAAATGGGCTTCGGCAACCACCTGCCCAACCCCGAACTCGCCGCCGTCTACG 
CCACCTGGGCCCGCGGCGACTGGGGCCTGATTCTCACCGGCAACGTCCAAGTCGACCACGCGCACAAGGGCGACGCC 
CACGACATCAGCCCCAACCACCCCGGCACCACGCCCGAGCAGACCGTCACGGCCTTCAAGGCCTGGGCGGACGCCGC 
GCGCCTGAATGGCCAGTCCAAAACGCCTGTGGTCGTGCAGATCAACCACCCTGGTCGCCAGAGTCCGATGGGCGCGG 
GCACGCGGGGACTGTGGGAGAAGGCGGTGGCGCCCTCGCCGGTGCCGTTGGTGTTGGGAGAGGCGTTTGTGCCTCGC 

TTGTTGTCGAAAGTGCTTTTCGGCACGCCGCGGGAGCTGACGGTTGCGGAGATCAAGGATATCGTGCAAAAGTTTGC 
GGTGACGGCGAGGATCACGGCCGAGGCCGGGTTCAATGGCGTGGAGATCCATGCGGCGCATGGATACCTGTTGGCGC 
AGTTCTTGAGCAAGAAGACAAACAGGCGCGGGGATGAGTATGGCGGGTCGGCTGAGAACAGGGCGAGGATTGTTGGG 
GAGATTATTAAGGAGTGCAGGAGGCAGGTGACTGAGGCGGTGGGTGAAGAGGAGGCGAAGAAGTTTGTGGTGGGAAT 
CAAGCTGAACAGTGCGGATTGGCAGGCGGGACGCGATGGAAAGGAGGAGGAGGAGACGGATACGGCGGAGGAGGTGT 

TGAAGCAGATTGAGCTTTTTGAGCAGTGGGGGATCGACTTTGTCGAGGTTAGCGGTGGCAGTTATGAGGATCCTCAG 
ATGGCCAACGGTCCCAAGCCCGAAAAGTCCGAACGCACCATGGCCCGCGAGGCCTTCTTCCTCGAGTTCGCCAAGAT 
CATCCGCACCAAGTTCCCCAAGCTTCCTCTCATGGTCACCGGCGGCTTCCGCACTCGTCAGGGCATGGAGGCCGCTT 
TGGAATCCGATGATTGCGACATGATCGGTATCGGACGCCCGGCCATCATCAACCCTTCGCTTCCCGCCAACTTGATC 
CTCAACCCGGAGGTGCCGGATGCGGATGCCCGCTTGTTCGACAAGAAGAGGGCTGAGCCGCACTGGATCGTTGAGAA 

GTTGGGCATGAAGTCCATTGTTGGTGCTGGTGTTGAGGTGACGTGGTATGTGAGCGAGCTCAAGAAGCTGGCCAAGT 
TTTAG 



SEQ ID No 19 

MATSTTSDLKLSQPLTLPNGLTLPNRLVKAAMAEQMGFGNHLPNPELAAVYATWARGDWGLILTGNVQVDHAHKGDA 
HDISPNHPGTTPEQTVTAFKAWADAARLNGQSKTPVVVQINHPGRQSPMGAGTRGLWEKAVAPSPVPLVLGEAFVPR 
LLSKVLFGTPRELTVAEIKDIVQKFAVTARITAEAGFNGVEIHAAHGYLLAQFLSKKTNRRGDEYGGSAENRARIVG 
EIIKECRRQVTEAVGEEEAKKFVVGIKLNSADWQAGRDGKEEEETDTAEEVLKQIELFEQWGIDFVEVSGGSYEDPQ 
MANGPKPEKSERTMAREAFFLEFAKIIRTKFPKLPLMVTGGFRTRQGMEAALESDDCDMIGIGRPAIINPSLPANLI 
LNPEVPDADARLFDKKRAEPHWIVEKLGMKSIVGAGVEVTWYVSELKKLAKF 



SEQ ID No 20 

atgtcggcagaaaagaagactttgagcaaaccggccgccggggtgccttactacaccccagcccaggagccgccggc 
agggacccctttgcagcagcaggacgccatcccaacgctgttcaagcctctgaagatccgtggcgtcgagctctcca 
accgctttggcgtctcgcccatgtgcacctactcagccgacgatggccacctgaccgacttccacttggtgcacctg 

ggccagttcgccctgcacggcacggccctgaccattgtcgaggccacatccgtcacgcccaacggacgcatctcgcc
cgaggacagcggcctgtggcaagacagccagatcgctcctctgcgccgcatcgtcgactacgtgcacagccagggcc
aaaagatcgccatccaactggctcatgccggccgcaaggccagcacaaaggccccctggcacgactccttcaccccc 
agcggcgagtataagccgagagagggcttacaggtcgtcggacccgagtatggcggctggcctgatgacgtctgggc 
cccgagcgccatcccgttctcggaggactttccgaaccccaaggagatgaccgttgaggagattgagggactcgtca 

ccagctttgtggacgctgccaagcgtgccatcgaggccggcgtcgacattattgagattcacggcgctcacggttac
ctgatcaccgagttcctttcgccgctatcaaacgtaagtggagatactttgtgtggggctgtgcgcatactccctcg 
ggtgtgacttctattaacattttatttcctggcacgcagaaacggacagacaagtacggcggcagctttgagaaccg 
cacccgggtcctgatcgatattatcaaggccgtccgggcagtgattcccgaggagatgccactcttcgtccgaatct 
ccgcgaccgaatggatggagtacgccggcgagcctagctgggacctcgagcagagcacacagcttgccaagctcctc 

ccggacctgggtgtcgacctgctcgacgtcagctcgggcggaaactcggtggcccaaaagatcgagctcacgccgta
ctaccagatcgacctggcagccaagatccgcgaggccgtcggcgataggttgctcataggcgcggtcggcaacatca 
acacggctgacattgcgcgcgatgtcgtggatgagcagggcgccgagaaggtggccgaggccaagcagacgcatgac 
accatcgaggtcgtgagcgaatcacatggcggcaagaccaaggcggatctggtcctcattgctcgccagttcctgcg 
cgagcctgagtttgtgctgaggacggcgcataaccttggggtcaatgtgcagtggcctcaccaataccacagagcag 

tgtggcgcaagggtgcaaggatttga



SEQ ID No 21 

ATGTCGGCAGAAAAGAAGACTTTGAGCAAACCGGCCGCCGGGGTGCCTTACTACACCCCAGCCCAGGAGCCGCCGGC
AGGGACCCCTTTGCAGCAGCAGGACGCCATCCCAACGCTGTTCAAGCCTCTGAAGATCCGTGGCGTCGAGCTCTCCA 
ACCGCTTTGGCGTCTCGCCCATGTGCACCTACTCAGCCGACGATGGCCACCTGACCGACTTCCACTTGGTGCACCTG
GGCCAGTTCGCCCTGCACGGCACGGCCCTGACCATTGTCGAGGCCACATCCGTCACGCCCAACGGACGCATCTCGCC 
CGAGGACAGCGGCCTGTGGCAAGACAGCCAGATCGCTCCTCTGCGCCGCATCGTCGACTACGTGCACAGCCAGGGCC 

AAAAGATCGCCATCCAACTGGCTCATGCCGGCCGCAAGGCCAGCACAAAGGCCCCCTGGCACGACTCCTTCACCCCC
AGCGGCGAGTATAAGCCGAGAGAGGGCTTACAGGTCGTCGGACCCGAGTATGGCGGCTGGCCTGATGACGTCTGGGC 
CCCGAGCGCCATCCCGTTCTCGGAGGACTTTCCGAACCCCAAGGAGATGACCGTTGAGGAGATTGAGGGACTCGTCA 
CCAGCTTTGTGGACGCTGCCAAGCGTGCCATCGAGGCCGGCGTCGACATTATTGAGATTCACGGCGCTCACGGTTAC 
CTGATCACCGAGTTCCTTTCGCCGCTATCAAACAAACGGACAGACAAGTACGGCGGCAGCTTTGAGAACCGCACCCG 

GGTCCTGATCGATATTATCAAGGCCGTCCGGGCAGTGATTCCCGAGGAGATGCCACTGTTCGTCCGAATCTCCGCGA
CCGAATGGATGGAGTACGCCGGCGAGCCTAGCTGGGACCTCGAGCAGAGCACACAGCTTGCCAAGCTCCTCCCGGAC 
CTGGGTGTCGACCTGCTCGACGTCAGCTCGGGCGGAAACTCGGTGGCCCAAAAGATCGAGCTCACGCCGTACTACCA 
GATCGACCTGGCAGCCAAGATCCGCGAGGCCGTCGGCGATAGGTTGCTCATAGGCGCGGTCGGCAACATCAACACGG 
CTGACATTGCGCGCGATGTCGTGGATGAGCAGGGCGCCGAGAAGGTGGCCGAGGCCAAGCAGACGCATGACACCATC 

GAGGTCGTGAGCGAATCACATGGCGGCAAGACCAAGGCGGATCTGGTCCTCATTGCTCGCCAGTTCCTGCGCGAGCC
TGAGTTTGTGCTGAGGACGGCGCATAACCTTGGGGTCAATGTGCAGTGGCCTCACCAATACCACAGAGCAGTGTGGC 
GCAAGGGTGCAAGGATTTGA

SEQ ID No 22 


MSAEKKTLSKPAAGVPYYTPAQEPPAGTPLQQQDAIPTLFKPLKIRGVELSNRFGVSPMCTYSADDGHLTDFHLVHL
GQFALHGTALTIVEATSVTPNGRISPEDSGLWQDSQIAPLRRIVDYVHSQGQKIAIQLAHAGRKASTKAPWHDSFTP
SGEYKPREGLQVVGPEYGGWPDDVWAPSAIPFSEDFPNPKEMTVEEIEGLVTSFVDAAKRAIEAGVDIIEIHGAHGY
LITEFLSPLSNKRTDKYGGSFENRTRVLIDIIKAVRAVIPEEMPLFVRISATEWMEYAGEPSWDLEQSTQLAKLLPD
LGVDLLDVSSGGNSVAQKIELTPYYQIDLAAKIREAVGDRLLIGAVGNINTADIARDVVDEQGAEKVAEAKQTHDTI
EVVSESHGGKTKADLVLIARQFLREPEFVLRTAHNLGVNVQWPHQYHRAVWRKGARI



SEQ ID No 23

ATGACTATTGTTAATGAAGGAGCCGAAAATGTTGGTTATTTTACACCTGCGCAAAAAATACCAGCTGGAGCGGCGAT 

AGGTGTACCGCaAACA?UU^TTATTTACTCCTCTTAAAATTAGAGGAGTGGAGTTCCATAACAGAATGT^^ 

CGATGTGCACTTATTCCGCTGACCAAGAAGGGCATTTGACAGATTTTCACCTAGTACATCTTGGAGCGATGGGAATG 

CGTGGGCCTGGCCTTGTAATGGTAGAAGCGACAGCGGTTTCCCCAGAGGGACGAATTTCACCTAATGATTCAGGATT
ATGGATGGAGTCGCAAATGAAGCCGTTACGAAGAATTGTTGAATTTGCTCATTCGCAAAATCAAAAAATTGGGATTC 
AATTGGCGCATGCTGGTAGAAAGGCTAGCACCACTGCTCCTTATCGAGGATACACAGTTGCGACTGAAGCTCAAGGT 
GGGTGGGAGAATGATGTTTATGGACCAAATGAAGACAGGTGGGACGAAAACCACGCTCAACCTCATAAGTTAACTGA 
AAAGCAATATGATGAATTAGTGGATAAGTTTGTTGTTGCTGCGAAGCGTGCAGTTGAAATAGGTTTTGATGTAATTG 

AAATTCATGGCGCTCATGGTTATCTTATATCGTCAACAGTTAGTCCTGCCACTAATGACCGCAATGACAAGTATGGT
GGGACATTTGAGAAACGTATTTTGTTTCCTATGGAAGTTGTCCATTCTGTTCGTAAAGCAATTCCAGATAGTATGCC 
CTTGTTTTATAGAGTAACGGCTACAGATTGGTTGCCCAAAGGACAAGGATGGGAGATAGAAGATACAGTTGCATTAG 
CAGCGAGGCTTCGCGATGGTGGTGTTGACTTGATAGATGTTAGCTCTGGTGGTAATCACAAGGATCAAAGAATTGAG 
GTGAAGGATTGCTATCAAGTTCCTTTTGCGGAAAAGATTAAGGATCAAGTGAATGGAATACTACTTGGCGCTGTCGG 

AATGATCAGGGATGGTCTTACGGCGAATGAAATCCTAGAAAGTGGAAAAGCTGATGTTACTTTTGTCGCAAGGGAGT
TCTTAAGGAACCCGTCGTTGGTGCTAGACAGCGCGAACCAGTTGGGTGAAAATGTTGCATGGCCAGTTCAGTATGAC
TATGCAGTTAAGGGACACAGAAAGTTACGTTGA 

SEQ ID No 24


MTIVNEGAENVGYFTPAQKIPAGAAIGVPQTKLFTPLKIRGVEFHNRMFVSPMCTYSADQEGHLTDFHLVHLGAM
GMRGPGLVMVEATAVSPEGRISPNDSGLWMESQMKPLRRIVEFAHSQNQKIGIQLAHAGRKASTTAPYRGYTVAT
EAQGGWENDVYGPNEDRWDENHAQPHKLTEKQYDELVDKFVVAAKRAVEIGFDVIEIHGAHGYLISSTVSPAT
NDRNDKYGGTFEKRILFPMEVVHSVRKAIPDSMPLFYRVTATDWLPKGQGWEIEDTVALAARLRDGGVDLIDVSS
GGNHKDQRIEVKDCYQVPFAEKIKDQVNGILLGAVGMIRDGLTANEILESGKADVTFVAREFLRNPSLVLDSANQ
LGENVAWPVQYDYAVKGHRKLR

SEQ ID No 25 

CGAAACCTCGACCCAAACAAACAGCTGACCCTCTCCTTGACAACAAAGCCGGCCATCCTCGCCGACGATTGCCTCTA
CCCCCGCATAGTCACACTCGCACGTCCGTTCTCCCACCGTCAAACAGACAGCATGACGGGCACCGCGAACAAGGCCG 
CCCCCGGTGTGCCGTTTTACACCCCGGCCCAGGAGCCTCCCGCGGGAACGCCAGTCGACGCCAGCACGGCTCCGACG 
CTCTTCAAGCCCCTCCGCATCCGCGACCTCACCATCAACAACCGCATCTGGGTCAGCCCCATGTGCCAGTACT 
CGACAATGGCCACGCGACCGACTACCACCTCGTCCACCTGGGGCAGTTCGCCCTGCACGGCGCCGCCCTGTCCATGG

TCGAGGCCACCGCCGTCGAGGCTCGTGGCCGCATCTCCCCCGAGGATGTCGGTTTGTGGCAGGACTCGCAGATTGCG
CCGCTGAAGCGCATCGTCGACTTTATCCACTCGCAGAACCAGGTCGCGGCCATCCAGCTCGCCCACGCCGGTCGCAA 
GGCTAGCACCCTGGCACCGTGGATCACCGAGGCTCGCGGCAAGGCGCTGGCTCAGGAGAGCGAGAACGGCTGGCCCG 
ACGACGTTGTGGCTCCCAGCGCGATTCCTTACACCAAGGACTGGGCCACACCGCGTGAGTTGACTACCGAGGRRGTC^ 
GAGGGTCTGGGTGAAGAAGTTCGCCGAGTCGGCCAAGAGGTCAAATCGAGCTGGTTTTGACGTCATTGAGATCCACG

CCGCTCA



SEQ ID No 26 

ATGACGGGCACCGCGAACAAGGCCGCCCCCGGTGTGCCGTTTTACACCCCGGCCCAGGAGCCTCCCGCGGGAACGCC
AGTCGACGCCAGCACGGCTCCGACGCTCTTCAAGCCCCTCCGCATCCGCGACCTCACCATCAACAACCGCATCTGGG 
TCAGCCCCATGTGCCAGTACTCCGCCGACAATGGCCACGCGACCGACTACCACCTCGTCCACCTGGGCCAGTTCGCC 
CTGCACGGCGCCGCCCTGTCCATGGTCGAGGCCACCGCCGTCGAGGCTCGTGGCCGCATCTCCCCCGAGGATGTCGG 
TTTGTGGCAGGACTCGCAGATTGCGCCGCTGAAGCGCATCGTCGACTTTATCCACTCGCAGAACCAGGTCGCGGCCA 

TCCAGCTCGCCCACGCCGGTCGCAAGGCTAGCACCCTGGCACCGTGGATCACCGAGGCTCGCGGCAAGGCGCTGGCT
CAGGAGAGCGAGAACGGCTGGCCCGACGACGTTGTGGCTCCCAGCGCGATTCCTTACACCAAGGACTGGGCCACACC 
GCGTGAGTTGACTACCGAGGRGTCGAGGGTCTGGGTGAAGAAGTTCGCCGAGTCGGCCAAGAGGTCAAATCGAGCTG 
GTTTTGACGTCATTGAGATCCACGCCGCT 

SEQ ID No 27

MTGTANKAAPGVPFYTPAQEPPAGTPVDASTAPTLFKPLRIRDLTINNRIWVSPMCQYSADNGHATDYHLVHLGQFA 
LHGAALSMVEATAVEARGRISPEDVGLWQDSQIAPLKRIVDFIHSQNQVAAIQLAHAGRKASTLAPWITEARGKALA 
QESENGWPDDVVAPSAIPYTKDWATPRELTTEXSRVWVKKFAESAKRSNRAGFDVIEIHAA
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SEQ ID No 28 

GAACTGCTGTAGATGTGGTTGAATTGGTATATTAGACCGGAGTACTCTATATGCGAGAGACTATACATTGAAGTTGC 
CAACGTTCTTCCAGATTGATTAATCATGGCTTACGAGATAATCGACAACGTTGCGGCTGAAGGGGTTCCATATTACA 
CACCGGCTCAAGACCCGCCAGCTGGTACGCAGACAAGCGGCTCAACGAAGCTATTCACACCCATCACCATCCGCGGC
GTCACATTCCCAAACCGCCTCTTCCTTGCCCCTCTCTGCCAATACTCCGCCAAAGATGGTTATGCCACTGATTGGCA 
CTTGACTCACCTCGGGGGAATAATCCAAAGAGGCCCCGGATTGTCCATGGTGGAGGCTACCGCTGTACAAAACCACG
GTCGCATCACACCTCAGGATGTTGGTCTGTGGGAAGACGGCCAGATCGAGCCTCTGAAGCGCATCACCACTTTCGCG 
CACAGTCAGAGCCAGAAAATTGGTATCCAGCTGTCGCATGCGGGTCGCAAGGCCAGTTGCGTATCTCCCTGGCTAAG
CGTAAATGCTGTCGCGGCGGAAGAAGTGGGTGGCTGGCCAGACAATATCGTTGCTCCCTCGGCCATCGCAGAAGAAA
ATGGTGTGAACCCAGTTCCCAAGGCTTTCACGAAGGAGGATATAGAGCAACTCAAGAGCGACTACGTGGAAGCGGCA 
AAACGAGCCATCCATGCTGGTTTCGATGTTATCGAAATTCATGCAGCTCATGGATATCTACTGCATCAATTCTTGAG 
TCCGGTAAGCAATCAAAGAACCGACGAGTATGG 

SEQ ID No 29

ATGGCTTACGAGATAATCGACAACGTTGCGGCTGAAGGGGTTCCATATTACACACCGGCTCAAGACCCGCCAGCTGG 
TACGCAGACAAGCGGCTCAACGAAGCTATTCACACCCATCACCATCCGCGGCGTCACATTCCCAAACCGCCTCTTCC 
TTGCCCCTCTCTGCCAATACTCCGCCAAAGATGGTTATGCCACTGATTGGCACTTGACTCACCTCGGGGGAATAATC 

CAAAGAGGCCCCGGATTGTCCATGGTGGAGGCTACCGCTGTACAAAACCACGGTCGCATCACACCTCAGGATGTTGG
TCTGTGGGAAGACGGCCAGATCGAGCCTCTGAAGCGCATCACCACTTTCGCGCACAGTCAGAGCCAGAAAATTGGTA 
TCCAGCTGTCGCATGCGGGTCGCAAGGCCAGTTGCGTATCTCCCTGGCTAAGCGTAAATGCTGTCGCGGCGGAAGAA 
GTGGGTGGCTGGCCAGACAATATCGTTGCTCCCTCGGCCATCGCACAAGAAAATGGTGTGAACCCAGTTCCCAAGGC 
TTTCACGAAGGAGGATATAGAGCAACTCAAGAGCGACTACGTGGAAGCGGCAAAACGAGCCATCCATGCTGGTTTCG 

ATGTTATCGAAATTCATGCAGCTCATGGATATCTACTGCATCAATTCTTGAGTCCGGTAAGCAATCAAAGAACCGAC
GAGTATGG 

SEQ ID No 30 

MAYEIIDNVAAEGVPYYTPAQDPPAGTQTSGSTKLFTPITIRGVTFPNRLFLAPLCQYSAKDGYATDWHLTHLGGII
QRGPGLSMVEATAVQNHGRITPQDVGLWEDGQIEPLKRITTFAHSQSQKIGIQLSHAGRKASCVSPWLSVNAVAAEE 
VGGWPDNIVAPSAIAQENGVNPVPKAFTKEDIEQLKSDYVEAAKRAIHAGFDVIEIHAAHGYLLHQFLSPVSNQRTD 
EY 

SEQ ID No 31

TTTGGATGGTATAATAATAATTCTATTTGTGAAACATACGGGGCTGGTCTTGATCAAGAACGGTCCATCTATGGTCT 
ATAAAGAACTCTTGTTCACTTTCTTTCCACGTCCCTTGAAGCTCCAATCAATCTGGTTCGCCATCTTGACCTCCACG 
CCAAGCTCGTTAGCAAAAGCTCGAACCAGACCAGGATTCTGTTGGAACCAACGTCCAGCCCTCACAATGTCGATACC 

40 AGATTGCAAAACCTCTTCAGCAAGATGTCCAGTCTTGATTCCACCTACTGCTGAAACAAGTACACTATCGCCTIAC^ 
CCTTCTTTACCTGTTTGGCGAGGTCTACCTGGTAAGCAGGACCGGACTTGATGGCGATGGCGGACTTAGGATGGATA 
CCGCCTGAGCTGACGTCCACCAAGTCTACTCCATGCTTGGGCAAGATACGCGCGAGTTGACAAGTCTGCTCGACTGT 
CCAGCTTTCAGGAAACTCGTCTTTGAATTGAGAGTCAAACTCGAACCAATCAGTTGCACTGACACGAACGAGGACAG 
GTGTAGTTTCGGGGATGGCAGCGCGGATGAGGTCAAGGATTTCCAAGACAACTCTGATACGGTTCTCGAAACTGCCA 

CCATACTCGTCGGTT

SEQ ID No 32 

AACCGACGAGTATGGTGGCAGTTTCGAGAACCGTATCAGAGTTGTCTTGGAAATCCTTGACCTCATCCGCGCTGCCA 
TCCCCGAAACTACACCTGTCCTCGTTCGTGTCAGTGCAACTGATTGGTTCGAGTTTGACTCTCAATTCAAAGACGAG
TTTCCTGAAAGCTGGACAGTCGAGCAGACTTGTCAACTCGCGCGTATCTTGCCCAAGCATGGAGTAGACTTGGTGGA 
CGTCAGCTCAGGCGGTATCCATCCTAAGTCCGCCATCGCCATCAAGTCCGGTCCTGCTTACCAGGTAGACCTCGCCA 
AACAGGTAAAGAAGGCTGTTGGCGATAGTGTACTTGTTTCAGCAGTAGGTGGAATCAAGACTGGACATCTTGCTGAA 
GAGGTTTTGCAATCTGGTATCGACATTGTGAGGGCTGGACGTTGGTTCCAACAGAATCCTGGTCTGGTTCGAGCTTT 
TGCTAACGAGCTTGGCGTGGAGGTCAAGATGGCGAACCAGATTGATTGGAGCTTCAAGGGACGTGGAAAGAAAGTGA
ACAAGAGTTCTTTATAG 

SEQ ID No 33 

TDEYGGSFENRIRVVLEILDLIRAAIPETTPVLVRVSATDWFEFDSQFKDEFPESWTVEQTCQLARILPKHGVDLVD
VSSGGIHPKSAIAIKSGPAYQVDLAKQVKKAVGDSVLVSAVGGIKTGHLAEEVLQSGIDIVRAGRWFQQNPGLVRAF
ANELGVEVKMANQIDWSFKGRGKKVNKSSL

SEQ ID No 34 
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AGGAAGTTGCATGTCACTTGTAGTGACAGGGCGTCGTGTAAATTTTATAAATACCTATACTTGTTTGTTCACTTCTA 
TGCTACTCATATCAATCCGAGAAGATCAAACAGTCCCCTATACACACTTGTCAAGACCTATCTATTATTTCAAAAAT
CAGCAATATGGCTGAGACAATGCCTAAGTGTGAGGCAAATGGCCATCACAAAATCATCATCAATAAGGAAGCTCCGA 
ATGTTCCTTTCTATACTCCAGTGCAAGATCCACCAGCAGGAACGTCTTACGATGTTCAGCCTGAAGGAAGCCTATTC 
TCTCTTATTAAAATAAGAAACCTGACTCTTCAAAACCGGATTTTTGTCTCCCCAATGTGTCAATATTCAGCAAAGGA
TGGTGTCATGACCCCCTGGCACAi^CAACACCTGGGCAGCTTCGCAGCACGAGGTCCGGGTCTCATTGTCACAGAAG 
TCAACGCAGTTTCACCAGAGGGACGAATCAGTCCTGAGGATGCAGGCATCTACGATGATGGGCAGCTTGGACCTCTC 
CGGGATATTGTGGACTTTGTACACAGCCAGGGCGCCAAGATTGCTATTCAGATAGGTCATGCTGGGAGAAAAGCGAG 
CACAGTCGTACCGTGGCTGGACCGCAAGAACACTGCTTTTA 


SEQ ID No 35 

MPKCEANGHHKIIINKEAPNVPFYTPVQDPPAGTSYDVQPEGSLFSLIKIRNLTLQNRIFVSPMCQYSAKDGVMTPW 
HKQHLGSFAARGPGLIVTEVNAVSPEGRISPEDAGIYDDGQLGPLRDIVDFVHSQGAKIAIQIGHAGRKASTVVPWL
DRKNTAF 

SEQ ID No 36

GCACGAGGGATTATTGACAACATCGCGGCTGAAGGGGCTCCCTACTACACGCCTGCTCAAGACYCTCCAGCAGGCAC
ACAGACCAGCGGCTCAACCAAGGTTTTCACACBCATCACCATCCGAGGCGTCACATTCCCAAACCGTCTCTTTCTTG 
CCCCTCTCTGTCAATACTCCGCCAAAGATGGATATGCTACTGATTGGCACTTGACTCATCTCGGAGGCATTATCCAA 
CGAGGCCCGGGACTGTCCATGGTAGAGGCCACCGCTGTTCAAAACCACGGTCGCATCACGCCTCAGGACGTTGGTCT 
CTGGGAAGATGGACAAATCGAGCCCTTTGAAGCGCATCACTACTTTTGCCCACAGCCAAAGCWCAGAAGATTGGTAT 

TCAGCTCTCGCACGCTGGTCGTAAGGCTAGTTGTGTATCTCCGTGGTTGAGCATCAACGCTGTTGCCGCTAAGGAAG
TCGGTGGCTGGCCAGACAACATTGTTGCTCCTTCTGCCATCGCACAAGAAGCTGGCGTGAACCCTGTTCCCAAGGCC
TTCACCAAGGAGGATATCGAGGAACTCAAGAATGACTTTCTGGCTGCAGCMAAACGAGCCAWCCGCGCTGGTTTTGA
TGTCATCGAGATCCATGCAGCTCATGGATACKTGCTTCACCAGTTCTTGAGTCCAGTCAGTAACCAAAGAACCGATG 
AGTATGGTGGCAGCTTCGAGAACCGTATCAGAGTCGTCTTGGAGATCATTG 


SEQ ID No 37 

GCACGAGGGATTATTGACAACATCGCGGCTGAAGGGGCTCCCTACTACACGCCTGCTCAAGACYCTCCAGCAGGCAC 
ACAGACCAGCGGCTCAACCAAGGTTTTCACACBCATCACCATCCGAGGCGTCACATTCCCAAACCGTCTCTTTCTTG 
CCCCTCTCTGTCAATACTCCGCCAAAGATGGATATGCTACTGATTGGCACTTGACTCATCTCGGAGGCATTATCCAA 

CGAGGCCCGGGACTGTCCATGGTAGAGGCCACCGCTGTTCAAAACCACGGTCGCATCACGCCTCAGGACGTTGGTCT
CTGGGAAGATGGACAAATCGAGCCCTTGAAGCGCATCACTACTTTTGCCCACAGCCAAAGCCAGAAGATTGGTATTC 
AGCTCTCGCACGCTGGTCGTAAGGCTAGTTGTGTATCTCCGTGGTTGAGCATCAACGCTGTTGCCGCTAAGGAAGTC 
GGTGGCTGGCCAGACAACATTGTTGCTCCTTCTGCCATCGCACAAGAAGCTGGCGTGAACCCTGTTCCCAAGGCCTT 
CACCAAGGAGGATATCGAGGAACTCAAGAATGACTTTCTGGCTGCAGCMAAACGAGCCAWCCGCGCTGGTTTTGATG

TCATCGAGATCCATGCAGCTCATGGATACKTGCTTCACCAGTTCTTGAGTCCAGTCAGTAACCAAAGAACCGATGAG
TATGGTGGCAGCTTCGAGAACCGTATCAGAGTCGTCTTGGAGATCATTG 

SEQ ID No 38 

ARGIIDNIAAEGAPYYTPAQDXPAGTQTSGSTKVFTXITIRGVTFPNRLFLAPLCQYSAKDGYATDWHLTHLGGIIQ
RGPGLSMVEATAVQNHGRITPQDVGLWEDGQIEPLKRITTFAHSQSQKIGIQLSHAGRKASCVSPWLSINAVAAKEV
GGWPDNIVAPSAIAQEAGVNPVPKAFTKEDIEELKNDFLAAXKRAXRAGFDVIEIHAAHGYXLHQFLSPVSNQRTDE
YGGSFENRIRVVLEIX

SEQ ID No 39 

CCTCAAGATCCGAGGTCTTACCCTCCAGAACCGTATTATGTTGAGGGGGCTCTGCCAGTACTCTGCTCCCGACGGAC
ACTACACAATGTGGCATCACACCCACATGGGCGGCATCATCCAACGCGGTCCCGGACTCACCTGCGTTGAAGCCACA 
GCCGTGACTCCTCAAGGTCGCATCACGCCTGAAGACGTCGGTATCTGGCAAGATTCTCAGATCGAGCCTCTTGCCAA 
GGTCGTCGAGTTTGCCCACTCCCAGAACCAGAAGATCATGATTCAGTTGGCGCATGCGGGCCGCAAAGCGAGCACTG 
TGGCACCATGGTTAAGCGGCGGCGATGTTGCTGGTGAGGACGTCAACGGATGGCCACAGGATGTCTGGGCGCCCAGT 

GCGATTCCATGGAACGAGAAGCACGCTGTCCCAAAGGAGATGTCGTTGGATGATATCGAGGCTTTCAAGAAGGCGTT
TGGAGAGGCGGTCAAGCGGGCATTGAAGGCTGGATTTGATGTTATTGAGATTCACAATGCTCACGGATACCTCCTCC 
ACGAATTCATCTGCCTGAGAGCAACACCAGGACCGACAAGTACGGGCGGAAGCTGGGAAAACCGCACTCGTCTGACA 
ATGGAAAGTCGTCGACCTTGTCCGCAGCATT 


SEQ ID No 40

LKIRGLTLQNRIMLRGLCQYSAPDGHYTMWHHTHMGGIIQRGPGLTCVEATAVTPQGRITPEDVGIWQDSQIEPL
AKVVEFAHSQNQKIMIQLAHAGRKASTVAPWLSGGDVAGEDVNGWPQDVWAPSAIPWNEKHAVPKEMSLDDIEAFKK
AFGEAVKRALKAGFDVIEIHNAHGYLLHEFICLRATPGPTSTGGSWENRTRLTMESRRPCPQH

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SEQ ID No 41 

GACTGCCGAGTAAACGCGCCGGCAAGGAGGCGGGAGGATGGCCGGAGGATGTTGTGGGTCCGTCGGGTGGGGAGGAC 
TTTACGTGGGATGAGAGGTCCTCGAGCGACCCTAGTGGAGGCTACTATGCGCCGAGAGAGTTGTCGGTCAGAGAGAT 
CAAGGAGATGGTCCAAGACTGGGCGACAGCAGCGAAAAGGGCGGTGAAAGCGGGCGTGGATGTAATCGAAATCCACG
GCGCGCATGGGTACCTCATCCACGAATTCCTCTCACCCATTACCAACCGCCGGACAGATTCTTACGGCGGTTCTTTC 
GAAAACCGTACCCGTCTACTCATTGAAATCGTAACAGCCGTCCGAGCCGCGATGCCCTCCAGCATGCCTCTCTTCCT 
CCGCCTCTCCTCTACAGAATGGATGGAAGATACCGACATCGGCAAGAAGTTCGGAAGCTGGGATGTCGAAAGCACGA 
TCAAGATCTCCAAAATCCTGGCCGACTTGGGCGTTGATCTCCTCGACGTGTCTTCCGGTGGGAATCATCCTCAGCAG 
AAAATCAACATGTTCAACACC



SEQ ID No 42 

LPSKRAGKEAGGWPEDVVGPSGGEDFTWDERSSSDPSGGYYAPRELSVREIKEMVQDWATAAKRAVKAGVDVIEIHG
AHGYLIHEFLSPITNRRTDSYGGSFENRTRLLIEIVTAVRAAMPSSMPLFLRLSSTEWMEDTDIGKKFGSWDVESTI
KISKILADLGVDLLDVSSGGNHPQQKIKMFNT

SEQ ID No. 43

ATGTCCCCACCACGCTTCGAAGCGGCCCCTGCCGACCCCTCACCGCTCGGCACGCCGCTCAAATACCCCGTCTCGGG 

GCGGTCGGCGCCCAACCGGTTCCTCAACGCGGCCATGTCGGAGGGCCTGGCGACGTTTGACGAGGCGGACCCGTCCA
AGCGCGGCATCCCGACGGAGCAGCTGGTGCAGCTGTACCGGCGCTGGGGCCAGGGCGAGTGGGGCCAGATCCAGACG 
GGCAACGTCATGATCGACCCGGAGCACCTCGAGGCCCCGGGCAACATGGTGGTGCCGCGCGACGCCGAGCCCTCGGG 
CGAGCGCTTCGACATGTTXTCCAAGCTCGCCGCCGCCGCCAAGGAGCACGGCAGCCTCATCGTCGCGCAGGTCGGAC 
ACCCCGGTCGCCAGGCCCGCGGCAGCGTCCAGCAGCACCCCATTAGCGCCAGCGACGTGCAGCTTAAGCAGGAGATG 

TTTGGGTCAAAGTTTGGCGTGCCCAGGCCCGCTACCAAGGAGGATATTAAGGCGGTGATTGAGGGTTTTGCCCACAC
GGCCGAGTACCTTGAAAAGGCCGGTTTCGACGGTATCGAATTGCACGCCGCCCACGGTTACCTGCTGGCCCAATTCC 
TGTCCGAAACAACCl^CCAGCGCACCGACGAGTACGGCGGCAGCCTCGAAAACCGCATGCGGCTAATCCTCGAGGTC 
ACGGCCGAGGTCCGCAGGCGGACGAGCAAGAATTTCATCCTCGGCATCAAAATTAACAGCGTCGAGTTCCAGGAGAA 
GGGTTTCAAGCCAGAGGAGGCGGTGCAGTTGTGCGAGGCCCTCGAGGCCGCGGGCATGGATTTTGTCGAGACGAGCG 

30 GCGGCACCTATGAGAGTTTTGGTTTTGCGCACCGCAAGGAGTCCAGCCGCAAGCGGGAGJy^CTATTTTATCGAGTTC 
GCCGAGGTCATCCGCAAGGCCGTCAAGCACATGGTGGTCTACACCACCGGCGGCTTCAAGACGGTGGGCGCCATGGT 
CGACGCGCTGCAGGGCGTCGATGGGATAGGCATCGGGCGCGCAGCCGGTTCGGAGCCGGACCTCGCCAAGGACATCA
TCGCGGGCAAGGTGTCCAGCATTATCAAATACGCCATGGGGGAGGACGAGTTTGTGCTGCAGTTGACTGCCTGCTCG 
GCGCAAATAAGGCTGATGGCCAAGGGCGAGGAGCCGTTTGACATCTCAAACGCCGACGAGGTGGCGCGGGTGACGCA 

GTTGATGGCGGAGGGCAAGGTG



SEQ ID No. 44

MSPPRFEAAPADPSPLGTPLKYPVSGRSAPNRFLNAAMSEGLATFDEADPSKRGIPTEQLVQLYRRWGQGEWGQIQT
GNVMIDPEHLEAPGNMVVPRDAEPSGERFDMFSKLAAAAKEHGSLIVAQVGHPGRQARGSVQQHPISASDVQLKQEM
FGSKFGVPRPATKEDIKAVIEGFAHTAEYLEKAGFDGIELHAAHGYLLAQFLSETTNQRTDEYGGSLENRMRLILEV
TAEVRRRTSKNFILGIKINSVEFQEKGFKPEEAVQLCEALEAAGMDFVETSGGTYESFGFAHRKESSRKRENYFIEF
AEVIRKAVKHMVVYTTGGFKTVGAMVDALQGVDGIGIGRAAGSEPDLAKDIIAGKVSSIIKYAMGEDEFVLQLTACS
AQIRLMAKGEEPFDISNADEVARVTQLMAEGKV


SEQ ID No. 45 

AGCTTAGACCTACAGAGAGCATTGCTACTGTAAGTTGTATTTCGCCTTCTCGCATAGAACAAAATATAACTGATGGT 
GTAGGTATAAAACTAGCATCCTCTTCCACCTTTCAGATCCCCCTGACAAGCACCTTATGGCTTTCGATGGAAACAGC 
TATTCCTTCTACTGGTAAAAATAGGATACCAGAGGCTACAATCAATACACCCTCGATAGAGGCTGTCGAATGTGGCC 

AACTGGCAACGCTGCGGTTAGTCATCGTCGGAGACTTTCTGGGATTCATTTTCTTCCGAGTCTCCGCCTGCTTATTA
AGGCATCAATCTGGATGCTCCACTGTGGTACATCCAATTTTCGATTTTTCTTCGGCAGAGGCAAGGATTCCACTGGT 
TCAGTCTAGGCATTTAGAAGATCAAAGCTGTCCTGTACCTCCGTACCTGGGTGTTCGACGTCATTGCCACGTTTCGA 
CCCAAGGGCAGACGCCATGTCGCCGAGCGATCGCCGCGATATGCCTCGAATTTGCGCCATTCGGCATCCAGTTTCCA 
GTGCCCTTCCCCGAATGACTGTCTCCACTATTCGGCAAGATTGTAAATCAAGCCTGAAGAAGCGGAGCATTCTTGGA 

AGTCGTATGTTCTACTGATTCTGTGCCTGGCGCAGACGGGTATATAATAAAGATCACGCACCGAGGAGTTCTTA

SEQ ID No. 46 
GTTCGACGTCATTGCCACG 



SEQ ID No. 47

CCTTGATCGTTGCTGAGCG 
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SEQ ID NO. 48 
ATGACTGTCGCCGATATCG 

SEQ ID No. 49 
CTATACATCGAAAATAGACTGC

SEQ ID No. 50 

CCGTCCTGGGCGGAGTATTGGCAGAG 

SEQ ID No. 51

GCGAATCAGATTCTAGAGGAGCAGGATATCG 

SEQ ID No. 52 

GCTCAGCACCTCGGCGTCGAAATCTCC 

SEQ ID No. 53 
TCTGCCAATACTCCGCC 

SEQ ID No. 54 
CTTTCCGGCCGGCATG

SEQ ID No. 55 

GGTATTGAGGGTCGCATGACTGTCGCCGATATCGA 

SEQ ID No. 56

AGAGGAGAGTTAGAGCCTACATCGAAAATAGACTGCTTGTACACC 

SEQ ID No. 57 

GGTATTGAGGGTCGCATGTCGCAACCTGTTGTG 






SEQ ID No. 58 

AGAGGAGAGTTAGAGCCTATATCTTCTCGAGTTTCTTCC



SEQ ID No. 59 
GGTATTGAGGGTCGCATGGGTTCCAACGCCTTC

SEQ ID No. 60 

AGAGGAGAGTTAGAGCCTAAATGGCCCTGCCAAACTG 

SEQ ID No. 61

GGTATTGAGGGTCGCATGGCTCTCCCTGACGTCGAAA 



SEQ ID No. 62

AGAGGAGAGTTAGAGCCTACTCAAAGATGCTCTCC 
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SEQ ID No. 63 

GGTATTGAGGGTCGCATGACAGTTCCATACCAAG 



SEQ ID No. 64

AGAGGAGAGTTAGAGCCTAATTTACTTCTAATTTAGATGTTC 



SEQ ID No. 65 

GGTATTGAGGGTCGCATGTCGGCAGAAAAGAAG 


SEQ ID No. 66 

AGAGGAGAGTTAGAGCCCAAATCCTTGCACCCTTGCGCC 

SEQ ID No. 67
CAGACCAATGGCCAGAAGA

SEQ ID No. 68 

AGATGGGCGATGTGGTAGTC

SEQ ID No. 69

gccgcttacagggaatgata 

SEQ ID No. 70 
atggctcaatctgcgagtct 


SEQ ID No. 71 
CGACTCTTGTGGGTGCTGTA 

SEQ ID No. 72 
GTGGAAAACACCCATTCTGG

SEQ ID No. 73 
CCCCAATCGTCAGATGAAGT 

SEQ ID No. 74

CTGGCCCACGATTCACTAAT 

SEQ ID No. 75 
caaaagatcgccatccaact 


SEQ ID No. 76
ctggtgacgagtccctcaat

SEQ ID No. 77 
ccagcagatgttcgaccccaag

SEQ ID No. 78
cagtgaactccatctcgtccatac 

SEQ ID No. 79

TCCGTGGCGTCACCTTCC 

SEQ ID No. 80 
CAGATGGGCGATGTGGTAGTC 


SEQ ID No 81 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120
ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240
attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300
tacgacagct gtctcttata cacatctcaa ccatcatcga tgaattttct cgggtgttct 360
cgcatattgg ctcgaattcg agctcggtac ccggggatcc tctagaagtc ctgaatagta 420
gtttgtggat taacattgtt ccgatgtagg aatcatgatc ccaaccagaa gagctggaca 480
gcccctcttc cagagcattt ttggtgggat gttttggctt agtgcgatgc aactggacaa 540
agtccttccg tttctactgc gtcttacatc atctggtatc tacgcaagcc gcccacttac 600
catatgaata agaggcactc aggttttccc tcaccccccc gaagcgatgg taagcgggtg 660
ccaaatgcat cgggagtttc tctatcatast taacctaggt attccgtaat ctattaccag 720
tctttccgaa gagctggtag caactgcacg agatttgtag gagcgagtac cc^gctggac 780
gagcacgcag cacggctatt ggtcagcatg gtagctaccg aggggaggca ggccgcccaa 840
atatcgtgag tctcctgctt tgcccggtgt atgaaaccgg aaaagctgct atagagcttc 900
tgggcggcgc atgtcgggaa accagcagca agctgaccca gaaagacccg tcctcaagcc 960
attaccgtac taatcaatta tttgtgtagc aacactggga agctgtagtg cataggctgg 1020
agcagctatt tggcctttag ccccgtctgt ccgcccggtg tgcggtttcg actggcgcgc 1080
aagctcaagg tgatcaggtc gttgcgtcag tcggagacaa caagccattg ccttttctac 1140
tgcccctccc ccgctggtgg cctttttctc tcatcttctc ctctcttccc atcatcagca 1200
tcattaatct actgtctctc tttctttcta tcattctata aagtaagaac atatccatct 1260
tccctcaatc ccgtctacaa tagtgtcctc ttcactactc tgtctctatc tctcaaagct 1320
tgactgacat ttaccccgct cagtaccaga cgaatctaca cagaattcga gctcactaaa 1380
ccatggccaa gttgaccagt gccgttccgg tgctcaccgc gcgcgacgtc gccggagcgg 1440
tcgagttctg gaccgaccgg ctcgggttct cccgggactt cgtggaggac gacttcgccg 1500
gtgtggtccg ggacgacgtg accctgttca tcagcgcggt ccaggaccag gtggtgccgg 1560
acaacaccct ggcctgggtg tgggtgcgcg gcctggacga gctgtacgcc gagtggtcgg 1620
aggtcgtgtc cacgaacttc cgggacgcct ccgggccggc catgaccgag atcggcgagc 1680
agccgtgggg gcgggagttc gccctgcgcg acccggccgg caactgcgtg cacttcgtgg 1740
ccgaggagca ggactgagaa ttccactagt gcagaaagct gttttccttg ctctgtggta 1800
taagtctagt gccactattc tatgatgagt tgatgactct ttcatgactg gaaggcttac 1860
attctccaag atcatgtctc actcaaaact tatctcgggt tcactttcgg gttccatata 1920
tctcatcatt tctgggttta gaaacatctc tctcgttttt gcagctcttc tacgtactcc 1980
tagcggtttc actgaaatga atacatttgg gtaacctaat tgccaattca tatcttcctg 2040
agggcagtaa cacatcacgt acattctatc agctgtgata gagttacaaa actagcaata 2100
cttttatgct tcctcctttc ttaccattta cacatccgct ttctctctgc tcttgatctt 2160
ggcccctgat tgtattgtca cctcaccaaa ttcaagtcat cacctcttct ctagagtcga 2220
cttttatgga cagcaagcga accggaattg ccagctgggg cgccctctgg taaggttggg 2280
aagccctgca aagtaaactg gatggctttc tcgccgccaa ggatctgatg gcgcagggga 2340
tcaagctctg atcaagagac aggatgagga tcgtttcgca tgattgaaca agatggattg 2400
cacgcaggtt ctccggccgc ttgggtggag aggctattcg gctatgactg ggcacaacag 2460
acaatcggct gctctgatgc cgccgtgttc cggctgtcag cgcaggggcg cccggttctt 2520
tttgtcaaga ccgacctgtc cggtgccctg aatgaactgc aagacgaggc agcgcggcta 2580
tcgtggctgg ccacgacggg cgttccttgc gcagctgtgc tcgacgttgt cactgaagcg 2640
ggaagggact ggctgctatt gggcgaagtg ccggggcagg atctcctgtc atctcacctt 2700
gctcctgccg agaaagtatc catcatggct gatgcaatgc ggcggctgca tacgcttgat 2760
ccggctacct gcccattcga ccaccaagcg aaacatcgca tcgagcgagc acgtactcgg 2820
atggaagccg gtcttgtcga tcaggatgat ctggacgaag agcatcaggg gctcgcgcca 2880
gccgaactgt tcgccaggct caaggcgagc atgcccgacg gcgaggatct cgtcgtgacc 2940
catggcgatg cctgcttgcc gaatatcatg gtggaaaatg gccgcttttc tggattcatc 3000
gactgtggcc ggctgggtgt ggcggaccgc tatcaggaca tagcgttggc tacccgtgat 3060
attgctgaag agcttggcgg cgaatgggct gaccgcttcc tcgtgcttta cggtatcgcc 3120
gctcccgatt cgcagcgcat cgccttctat cgccttcttg acgagttctt ctgaattatt 3180
aacgcttaca atttcctgat gcggtatttt ctcgcatgca tcactagtga attcgcggcc 3240
gcctgcaggt cgacctgcag gcatgcaagc ttgccaacga ctacgcacta gccaacaaga 3300
gcttcagggt tgagatgtgt ataagagaca gctgtcttaa tgaatcggcc aacgcgcggg 3360
gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc 3420
ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac 3480
agaatcaggg gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa 3540
ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca 3600
caaaaatcga cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc 3660
gtttccccct ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata 3720
cctgtccgcc tttctccctt cgggaagcgt ggcgctttct catagctcac gctgtaggta 3780
tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca 3840
gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga 3900
cttatcgcca ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg 3960
tgctacagag ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg 4020
tatctgcgct ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg 4080
caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 4140
aaaaaaagga tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa 4200
cgaaaactca cgttaaggga ttttggtcat gagattatca aaaaggatct tcacctagat 4260
ccttttaaat taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc 4320
tgacagttac caatgcttaa tcagtgaggc acctatctca gcgatctgtc tatttcgttc 4380
atccatagtt gcctgactcc ccgtcgtgta gataactacg atacgggagg gcttaccatc 4440
tggccccagt gctgcaatga taccgcgaga cccacgctca ccggctccag atttatcagc 4500
aataaaccag ccagccggaa gggccgagcg cagaagtggt cctgcaactt tatccgcctc 4560
catccagtct attaattgtt gccgggaagc tagagtaagt agttcgccag ttaatagttt 4620
gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc 4680
ttcattcagc tccggttccc aacgatcaag gcgagttaca tgatccccca tgttgtgcaa 4740
aaaagcggtt agctccttcg gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt 4800
atcactcatg gttatggcag cactgcataa ttctcttact gtcatgccat ccgtaagatg 4860
cttttctgtg actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgacc 4920
gagttgctct tgcccggcgt caatacggga taataccgcg ccacatagca gaactttaaa 4980
agtgctcatc attggaaaac gttcttcggg gcgaaaactc tcaaggatct taccgctgtt 5040
gagatccagt tcgatgtaac ccactcgtgc acccaactga tcttcagcat cttttacttt 5100
caccagcgtt tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa agggaataag 5160
ggcgacacgg aaatgttgaa tactcatact cttccttttt caatattatt gaagcattta 5220
tcagggttat tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat 5280
aggggttccg cgcacatttc cccgaaaagt gccacctgac gtctaagaaa ccattattat 5340
catgacatta acctataaaa ataggcgtat cacgaggccc tttcgtc 5387



SEQ ID No. 82 

ATGACAGTTCAATCACAGCAACAATCCCAGGCTATTCCGGTCCTTTCTTCCCAGAATGGCACTGAACCCCAAGACGC
AAACAAGGAGGTTGTTCAGAATGTCGCTGCCAAAGGAGTGCAATACTTCAACCCTGAGCAACTTCCTGCACCAGGTC 
TCGGTATAAACGGTCCCAATAATACTCTACCAAAGGTCTTTACACCCATCAAGATTCGCGGCATGACCATGCCCAAC 
CGTATCTGGGTCAGCCCCATGTGCCAATACAGTGCCCGTGACGGCTTTCAGCAGCCTTGGCACTTTGCCCACTACGG 
CGGACTGGCCCAACGTGGCCCTGGCCTCATCATGCTAGAAGCTACCGCAGTTCAAGCACGTGGCCGTATCACACCTG 

AAGATTCTGGCATCTGGCTAGACTCTCATGTTGAGGGACTGCGAAAGCACGTCGAGTTTGCCCATGCCAACAACTCT
CTTATCGGTATCCAGATTGGCCATGCTGGTCGCAAGGCCTCCTGCGTTGCTCCTTGGTTAGACGCCGGACTTGCCGC 
TGAAAAGGCCGCTGGTGGATGGCCCGATGACGTTGTCGGACCTAGCAACGAGCCTTTTGCTCCTGGCTACCCTACCC 
CCCGTGCTATTACTCTTGAAGAGATTGAACAGTTGAAGGAGGACTTTGTTTCCGGTGTTCGTCGAGCGGTTGAAGCA 
GGATTTGACACTATCGACTTCCATTTCGCTCACGGTTATCTTGTTTCCAGCTTCCTGTCCCCTGCCACCAACAAGCG 

TACCGACAAGTACGGAGGTAGCTTCGAGAACAGAGTGCGCCTTGCTCTCGAGATTGTCGAGGCTGCACGAGCTGTTA
TGCCTGAGGACATGCCCTTGTTCACTCGCATCAGTGGAACTGACTGGCTGGAGAACAACCCTGAGTACGAGGGAGAG 
ACCTGGACTCTTGAGCAGAGCATCAAGCTTGCACACCAGTTAGCAGACCGTGGTGTCGATGTTTTGGATGTTTCCAG 
TGGTGGCATCCACAAGATGCAAAAGGTCGCTGCTGGTCCCGGTTACCAGGCACCTCTTGCCAAGGCGATCAAGAAGT 
CAGTTGGAGACAAGATGTTGATCAGCACTGTTGGTAGCATCAAGATAGGTACCCTTGCGGAGGAGATCATCGCTGGA 

GGAGAGGACGATACCCCCTTGGATCTTGTGGCTTCAGGCCGTCTGTTCCAGAAGAACACTGGACTTGTTTGGTCATG
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GGCTGACGATCTGAACACTTCTATCCAGATCGCTCATCAGATCGCATGGGGTTTCGGTGGCAGAGCTAAGAAGAACG 
CTCCCAAGCTTGTCTTA 

SEQ ID No. 83 
FG00074.1 hypothetical protein 3813139459+

MTVQSQQQSQAIPVLSSQNGTEPQDANKEVVQNVAAKGVQYFNPEQLPAPGLGINGPNNTLPKVFTPIKIRGMTMPN
RIWVSPMCQYSARDGFQQPWHFAHYGGLAQRGPGLIMLEATAVQARGRITPEDSGIWLDSHVEGLRKHVEFAHANNS
LIGIQIGHAGRKASCVAPWLDAGLAAEKAAGGWPDDVVGPSNEPFAPGYPTPRAITLEEIEQLKEDFVSGVRRAVEA
GFDTIDFHFAHGYLVSSFLSPATNKRTDKYGGSFENRVRLALEIVEAARAVMPEDMPLFTRISGTDWLENNPEYEGE
TWTLEQSIKLAHQLADRGVDVLDVSSGGIHKMQKVAAGPGYQAPLAKAIKKSVGDKMLISTVGSIKIGTLAEEIIAG
GEDDTPLDLVASGRLFQKNTGLVWSWADDLNTSIQIAHQIAWGFGGRAKKNAPKLVL

SEQ ID No. 84 

ATGGACACGTCTCGATTCGTGTCTGGTCTCACACCGCCTCTCGTCGACTCGATCGATGCACTCAAGATCAGCAACTT

TGTCCCCACTCGAAGTGGCCACCCTCCTCCTGGCTCGGTCCCGGAATCCATCCTGCCAGAGGGTGTCAAAAAACCGG
CTTTGTTCCAAACGTTGACATTGCCCTTTGCTGCACCGGAACAGGCGGGTAAGATGACCTTCAAGAACCGCATCATT
GTCTCTCCCATGTGCCAGTACTCTGCGAACAATGGTCTTCCTACTCCGTACCACATTGCGCATTTGGGATCGTTTGC 
CCTGCACGGTGTGGGAAACGTCATGGTCGAAGCATCTGGTGTTGAGCCAGAGGGGAGGATCACGCCTCAGGACCTGG 
GTATTTGGTCGGAACAGCATCGGGATGCACACAAGGCGCTGGTGTCGGTGCTCAAGTCCTTCACGGATGGTCTGGGT

GTAGGGCTGCAACTGGCGCATGCGGGAAGGAAGGCCTCGGACTGGTCACCTTTCTACCGCGGAGAAAAGAAGCAAAA
GTTTGTGACGCAGGAGGAAGGTGGCTGGCCGGATCGTGTCGTCGCTCCTTCGGCCATCGCATATGCGCAAGGTCACG 
TTACCCCTCGAGCTCTCACGACCGAGGACATCAACAAGTTGCAAGACAAATTCGTTCAGTCGGCACGATGGGCGTTT 
GAAGCTGGGTATGACTACGTCGAACTTCACAGCGCTCACGGATACCTGATGCACTCGTTCCTCAGCCCGTTGACCAA 
TCAGCGTACCGACGAGTACGGCGGTAGCCTGGAGAACCGCGCTCGATTTCTGCTCAACGTTGCCCGTCGAATCCGCC 

AAGAATTCCCCAACAAGGGTCTCTGGGTGCGCGTCAGCTCCACCGACTGGGCCGACCAAGCGCACCAAGCCGACTCT
TGGACCGTTGACCAGACGGTTGAACTCGCCAAGATGCTCCAAGAGGCTCGAGTCGACCTGCTAGACGTCAGCTCCGG
CGGCCTGGTTCCATTCCAAAAAATCACCGTGGGAGCCGGATACCAGCTATTCGGAGCAAAAGCCGTTCGCGATGCTC 
TGGCCAAAATCGAACCCGACGCGTCCAAACGCATGCTCGTCGGGGCCGTGGGAATGATGGAAGGTTCCTACGATTCG 
CCCAACGGCCAAGACCGCAGCCAGATTGGCAAGTTGGCCGAGCAGTCGATTCAGAGCGGAGAGTGTGATGCGGTACT 

GTTGGCACGTGGATTGATGTCCTACCCAAGCTGGACCGAGGATGCTAGTGTAGCGCTGATGGGTACCAGGGCAGCTG
GCAACCCGCAGTACCATCGCGTTCACGTGGCTAAGAAGTGA 


SEQ ID No. 85 

MDTSRFVSGLTPPLVDSIDALKISNFVPTRSGHPPPGSVPESILPEGVKKPALFQTLTLPFAAPEQAGKMTFKNRII
VSPMCQYSANNGLPTPYHIAHLGSFALHGVGNVMVEASGVEPEGRITPQDLGIWSEQHRDAHKALVSVLKSFTDGLG
VGLQLAHAGRKASDWSPFYRGEKKQKFVTQEEGGWPDRVVAPSAIAYAQGHVTPRALTTEDINKLQDKFVQSARWAF
EAGYDYVELHSAHGYLMHSFLSPLTNQRTDEYGGSLENRARFLLNVARRIRQEFPNKGLWVRVSSTDWADQAHQADS
WTVDQTVELAKMLQEARVDLLDVSSGGLVPFQKITVGAGYQLFGAKAVRDALAKIEPDASKRMLVGAVGMMEGSYDS
PNGQDRSQIGKLAEQSIQSGECDAVLLARGLMSYPSWTEDASVALMGTRAAGNPQYHRVHVAKK
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CLAIMS 

1. Method of identifying an anti-fungal agent which
targets an essential protein or gene of a fungus
comprising contacting a candidate substance with

(i) a NADH:flavin oxidoreductase protein which
comprises the sequence shown by SEQ ID NO: 3,

(ii) a NADH:flavin oxidoreductase protein which is a
homologue of (i) and which comprises the sequence shown by
SEQ ID NO: 8, 12, 14, 19, 24, 42, 44, 83 or 85,

(iii) a protein which has 50% identity with (i) or
(ii),

(iv) a protein comprising a fragment of (i), (ii) or
(iii) which fragment has a length of at least 50 amino
acids,

(v) a polynucleotide that comprises sequence which
encodes (i), (ii), (iii) or (iv),

(vi) a polynucleotide comprising sequence which has at
least 70% identity with the coding sequence of (v),

and determining whether the candidate substance binds or
modulates (i), (ii), (iii), (iv), (v) or (vi), wherein
binding or modulation of (i), (ii), (iii), (iv), (v) or
(vi) indicates that the candidate substance is an
anti-fungal agent.

25 

2. Method according to claim 1 wherein (iii) or (iv) have
an oxidoreductase activity.

3. Method according to claim 1 or 2 wherein (i), (ii),
(iii) or (iv) comprise one or more of the motifs defined
by regions 1 to 11 in Figures 1 and 2.

4. Method according to any one of the preceding claims
comprising carrying out a redox reaction in the presence
and absence of the candidate substance to determine
whether the candidate substance inhibits the
oxidoreductase activity of a protein as defined in any one
of the preceding claims, wherein the redox reaction is
carried out by contacting said protein with NADH or NADPH,
and an electron acceptor, under conditions in which in the
absence of the candidate substance the protein catalyses
reduction of the electron acceptor.

5. Method according to any one of the preceding claims
wherein (iii) is a protein comprising the sequence of any
of the following: SEQ ID NO: 6, 10, 16, 22, 27, 30, 33,
35, 38, 40.

6. Method according to any one of the preceding claims
wherein the (i) or (ii) is an oxidoreductase of
Aspergillus flavus; Aspergillus fumigatus; Aspergillus
nidulans; Aspergillus niger; Aspergillus parasiticus;
Aspergillus terreus; Blumeria graminis; Candida albicans;
Candida cruzei; Candida glabrata; Candida parapsilosis;
Candida tropicalis; Colletotrichum trifolii; Cryptococcus
neoformans; Encephalitozoon cuniculi; Fusarium
graminearum; Fusarium solani; Fusarium sporotrichioides;
Leptosphaeria nodorum; Magnaporthe grisea; Mycosphaerella
graminicola; Neurospora crassa; Phytophthora capsici;
Phytophthora infestans; Plasmopara viticola; Pneumocystis
jiroveci; Puccinia coronata; Puccinia graminis;
Pyricularia oryzae; Pythium ultimum; Rhizoctonia solani;
Schizosaccharomyces pombe; Trichophyton interdigitale;
Trichophyton rubrum; or Ustilago maydis.

7. Method according to any one of the preceding claims
which further comprises formulating the identified
anti-fungal agent into an agricultural or pharmaceutical
composition.

8. Method according to any one of claims 1 to 6 which
further comprises killing or impairing the growth of a
fungus by contacting the fungus with the identified
anti-fungal agent.

9. Use of (i), (ii), (iii), (iv), (v) or (vi) as defined
in any one of claims 1 to 6 to identify or obtain an
anti-fungal agent.

10. Use of an anti-fungal agent identified by the method 
of any one of claims 1 to 6 in the manufacture of a 
medicament for prevention or treatment of fungal 
infection. 

11. Method of detecting the presence of a fungus in a
sample comprising detecting the presence in the said
sample of a protein or polynucleotide as defined in any
one of claims 1 to 3, 5 or 6.

12. Method according to claim 11 wherein the sample is
from a human, animal or plant individual who is suspected
of having a fungal infection.

13. An isolated protein or polynucleotide as defined in
any one of claims 1 to 3, 5 or 6.

14. A vector comprising a polynucleotide as defined in 
any one of claims 1 to 3, 5 or 6. 

15. A recombinant cell comprising a polynucleotide as
defined in any one of claims 1 to 3, 5 or 6 or a vector
according to claim 14.

16. A method of obtaining a protein as defined in any one 
of claims 1 to 3, 5 or 6 comprising expressing the protein 
from a polynucleotide as defined in any one of claims 1 to 
3, 5 or 6 or a vector according to claim 14. 

17. A method of obtaining a polynucleotide as defined in
claim 1 to 3, 5 or 6 comprising replication of a vector as
defined in claim 14 or synthesis of the polynucleotide by
condensation of nucleotides.

18. An organism which is transgenic for a polynucleotide
as defined in any one of claims 1 to 3, 5 or 6.

19. An organism which has been genetically engineered to
render a polynucleotide or protein as defined in any one
of claims 1 to 3, 5 or 6 non-functional or inhibited.

20. An antibody which is specific for a protein as 
defined in any one of claims 1 to 3, 5 or 6. 

21. A method for preventing or treating a fungal
infection comprising administering an anti-fungal agent
identified by the method of any one of claims 1 to 6.

22. A method for preventing or treating a fungal
infection comprising administering a protein or
polynucleotide as defined in any one of claims 1 to 3, 5
or 6.

23. A method of killing, or impairing the growth of, a
fungus comprising inhibiting the expression or activity of
a polynucleotide or protein as defined in any one of
claims 1 to 3, 5 or 6.

24. A method according to claim 23 wherein the fungus has
infected a human, animal or plant individual.

25. A fungus which has been killed, or whose growth has
been impaired, by inhibition of the expression or activity
of a protein or polynucleotide as defined in any one of
claims 1 to 3, 5 or 6.
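
The identity thresholds recited in claim 1 (50% for proteins, 70% for coding sequences) and the minimum fragment length of 50 amino acids can be illustrated with a minimal sketch. It assumes that identity is counted over the aligned columns in which neither sequence has a gap; this is only one possible convention, and the claims above do not fix the algorithm. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: both strings come from the same pairwise alignment and
    therefore have the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_threshold(aligned_a: str, aligned_b: str,
                    min_identity: float, min_length: int = 0) -> bool:
    """True if identity >= min_identity over at least min_length compared columns."""
    compared = sum(1 for a, b in zip(aligned_a, aligned_b)
                   if a != "-" and b != "-")
    return (compared >= min_length
            and percent_identity(aligned_a, aligned_b) >= min_identity)


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not taken from the sequence listing).
    print(percent_identity("ACDE-G", "ACDFHG"))
    print(meets_threshold("ACDE-G", "ACDFHG", min_identity=50.0, min_length=5))
```

Under this convention a candidate would be compared with SEQ ID NO: 3 using `min_identity=50.0`, and a fragment using `min_length=50`; other definitions of identity would give different values.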



[Figure 1: multiple alignment of amino acid sequences. Rows: SEQ ID Nos. 3, 6, 8, 10, 12, 14, 16, 19, 22, 24, 27, 30, 33, 35, 38, 40, 42, 44, 83 and 85; bacterial ORs T44612, NP_625402, NP_295913 and AF320254; OYE family members Af4875, Af4961, Ca2460, Nc4452, ScOYE1, ScOYE2, ScOYE3 and A36990.]

Figure 1. A multiple alignment of the 2031 OR amino acid
sequence from A. fumigatus (SEQ ID No. 3) along with related
2031 ORs from other fungi and bacteria (see Example 4) and
OYEs. Regions 1-11, marked with * or #, refer to amino acids
conserved between ORs but not OYEs.

Fungal 2031 ORs are given by the following SEQ ID Nos.: A.
fumigatus, SEQ ID Nos. 3, 6 and 8; A. nidulans, SEQ ID No.
10; C. albicans, SEQ ID Nos. 12 and 14; N. crassa, SEQ ID
Nos. 16 and 19; M. grisea, SEQ ID Nos. 22 and 44; S. pombe,
SEQ ID No. 24 (NP_595868); C. trifolii, SEQ ID No. 27; F.
sporotrichioides, SEQ ID Nos. 30, 33 and 35; F. graminearum,
SEQ ID Nos. 38 and 83; M. graminicola, SEQ ID Nos. 40 and 42;
U. maydis, SEQ ID No. 85.

Bacterial ORs resembling 2031 are: T44612 (Pseudomonas
putida); NP_625402 (Streptomyces coelicolor); NP_295913
(Deinococcus radiodurans); AF320254 (Azoarcus evansii).

Fungal ORs similar to the Old Yellow Enzyme family
(originally identified in S. cerevisiae): A. fumigatus,
Af4875 and Af4961; C. albicans, Ca2460 and A36990; N.
crassa, Nc4452; S. cerevisiae, OYE1, OYE2 and OYE3.

Details of the sequence searches that identified the ORs
other than SEQ ID No. 3, and methods for the construction
of multiple alignments are given in Example 4 hereinafter.
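
The idea of positions "conserved between ORs but not OYEs" can be sketched as a column scan over a multiple alignment. The sketch below assumes one particular reading of that phrase: a column qualifies when every OR row carries the same non-gap residue and no OYE row carries that residue at the same column. This criterion, the function names and the toy rows are all assumptions for illustration; the method actually used to mark regions 1 to 11 is described in Example 4, not here.

```python
from typing import List, Optional


def conserved_residue(column: str, gap: str = "-") -> Optional[str]:
    """Return the residue if all rows share the same non-gap residue, else None."""
    residues = set(column)
    if len(residues) == 1 and gap not in residues:
        return next(iter(residues))
    return None


def group_specific_columns(group_rows: List[str], other_rows: List[str],
                           gap: str = "-") -> List[int]:
    """0-based indices of columns conserved in group_rows and absent in other_rows.

    Assumption: all rows come from one multiple alignment and have equal length.
    """
    if not group_rows:
        raise ValueError("group_rows must not be empty")
    length = len(group_rows[0])
    if any(len(row) != length for row in group_rows + other_rows):
        raise ValueError("all aligned rows must have equal length")
    hits = []
    for i in range(length):
        residue = conserved_residue("".join(row[i] for row in group_rows), gap)
        if residue is None:
            continue  # not fully conserved within the first group
        if all(row[i] != residue for row in other_rows):
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Toy aligned rows (hypothetical, not taken from Figure 1).
    or_rows = ["HAGRKA", "HAGRKS"]
    oye_rows = ["HSNGYL", "HSAGYL"]
    print(group_specific_columns(or_rows, oye_rows))
```

Runs of adjacent qualifying columns would correspond to candidate motif regions; a looser criterion (for example, tolerating conservative substitutions) would select more columns.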



SEQ 



SEQ 1 GTTCGftCGTC ATTGCCftCGT TTCGACCCftA GGGCRGZiCGC CATGTCGCCG AGCGATCGCC GCGATATGCC TCGAATTTGC GCCMTCGGC ATCCAGTTTC 

SEQ A — " r ~ ** 

SEQ 5 

SEQ 9 1 II nil" 

SEQ 18 

SEQ 28 r 

SEQ 29 

SEQ 32 

SEQ 34 " ~" 

SEQ 37 ~ 



CAGTGCCCTT CCCCGAATGA CTGTCTCCAC TATTCGGCAA GATTGTAAAT CAAGCCTGAA GAAGCGGAGC AATTCTTGGA AGTCGTATGT TCTACTGATT 

SEQ 2 — - - ' GTATGT TCTACTGATT 

SEQ 4 I~III 

SEQ 20 T 

SEQ 21 ' , 

SEQ 23 

SEQ 25 CGAAA CCTCGACCCA AACAAACAGC 

SEQ 28 GAAC 

SEQ 29 

SEQ 32 

SEQ 34 ^AGGJiAG TTGCATGTCA CTTGTAGTGA CAGGGCGTCG TGTAAATTTT ATAAATACCT ATACTTGTTT GTTCMTTCT ATGCTACTCA TATCAATCCG 

SEQ 36 IIIIIII IIIIIIIIII 

SEQ 82 ' 

SEQ 84 



SEQ 1 TCTGTGCCTG GCGOWSACGG GTATATAAAT AAAGATCACC ' GCACCGAGGA GTTTCTTACC JU«:CCATCAA TAACCaTCCA CaATCTCCTA CAACAWUVAT 

SEQ 2 TCTGTGCCTG GCGCAGACGG GTATATAflAT AAAGATCACC GCftCCGAGGA GTTTCTTACC AACCCATCZ^. TARCCMCCA CflAICTCCTA CAAC2UVAAAT 

ggQ 4 • A TGTCGCAACC 

SEQ 7 IIIIIIIIII IIIII_II ^A TGGGTTCCAA 

SEQ 11 IIIIIIIIII IIIIIIIIII -r ATGACAG TTCCATACCA' 

SEQ 13 — — " ~ " 

ggQ 15 .^^^ • A TGGCCG2«:TT 

SEQ 17 

SEQ 18 — " ATGTC 

SEQ 21 IIIIIIIIII II 1 ATGTC 

SEQ 23 

SEQ 25 TGACCCTCTC CTTGACAACA AAGCCGGCCA TCCTCGCCGA CGATTGCCTC TACCCCCGCA TAGTCACACT CGCACGTCCG ttctcccacc gtcaaacaga 
SEQ 26 



SEQ 28 TGCTGTAGAT GTGGTTGftAT TGGTATATTA GJVCCGGAGTA CTCTATATGC GRGAGAGTAT ACATTGAAGT TGCCAACGTT CTTCCAGATT GATTAATCAT 

SEQ 29 ^'^ 

SEQ 32 

SEQ 34 AGJU«3ATCAA ACAGTCCCCT ATACACACTT GTCAAGACCT ATCTATTATT TCAAAAATCA GCAATATGGC TGAGACAATG CCTAAGTGTG AGGCAAATGG 

SEQ 35 

SEQ 37 

SEQ 39 : 

SEQ 41 

SEQ 43 " 

SEQ 82 ATGACAG TTCIU^TCACA GCAACAATCC CAGGCTATTC CCGTCCTTTC TTCCCAGAAT GGCACTGAAC CCCAAGACGC 

SEQ 84 ^AT GGACACGTCT CGRTTCGTGT CTGGTCTCAC 



301 311 321 331 341 351 361 371 3S1 391 

-.M^MMMMi... — — ****** ********** ********** ********** 

SEQ 1 GJiCTGTCGCC GATATCGACG TTCCTCX:TGC CGAGGGCATC CCCTACTTCA CTCC66CCCA. GRTkCCCTCCT GCCGGTACGG CAGCTAACCC CCAGACCAAT 

SEQ 2 GACTGTCGCC GATATCGACG TTCCTCCTGC CGAGGGCATC CCCTACTTCA CTCCGGCCCA GBACCCTCCT GCCGGTACGG CAGCTAACCC CCAGACCAAT 

SEQ 4 TGTTGTGCCT GACATCGAGA ACAAACCCGC GCCGGGTATC TCGTACTTTA CTCCGGCGCA AGAGCCCCCT GCTGGCACCG CTGCTAATCC TCAGTCTGAT 

SEQ 5 TGTTGTGCCT GACATCGAGA ACAAACCCGC GCCGGGTATC TCGTACTTTA CTCCGGCGCA AGAGCCGCCT GCTGGCACCG CTGCTAATCC TCAGTCTGAT 

SEQ 7 CGCCTTCCGG TCCCCCGCCG TCACCAAGTC CTCCTCCACC CCCTACTACA CTCCCGCCAA CAATGGAGGC GCCGCCCTGC ACCCCGACGA CCCCAC 

SEQ 9 GGCTCTCCCT GACGTCGAAA ACACCCCCGC CGCCGGCATC CCCTACTTTA CACCAGCACA GAACCCTCCT GCTGGAACAG CTGCCAACCC GCAAACCAGC 

SEQ XI AGTAAAACCA TCAGATGAAA TCAAAGGTGC TCCTGAGGTT TCCTATTACA CTCCAGAACA GCCTGTTCCG GCTGGTACTT TTTATCCCCA ATCGTC A 

SEQ 13 -: ^ATGGAAA ACAACAATAC TATACCG 

SEQ IS CACCC2U3AAG AAGftCCTCCT CCCCCGCGGC CCCGGGTGTT CCCTTCTACA CCCCGGCCCA GGTCCCCGCC GCCGGCACTC CCCTCCCCTC CACCCCC 

SEQ 17 ATGGCTACTT CCACTACCTC CGRCCTC 

SEQ 18 ATGGCTACTT CCACTACCTC CGRCCTC 

SEQ 20 GGCAGAAAAG AAGACTTTGA GCAAACCGGC CGCCGGGGT6 CCTTACTACA CCCCAGCCCA GGAGCCGCCG GCAGGGACCC CTTTGCAGCA GC2VGGACG — 

SEQ 21 GGCAGAAAAG AAGACTTTGA GCIUUVCCGGC CGCCGGGGTG CCTTACTACA CCCCAGCCCA GGAGCCGCCG GCAGGGACCC CTTTGCAGCA GCAGGACG — 

SEQ 23 ^ATGAC TATTGTTAAT GAAGGAGCCG ARAATGTTGG TTATTTTACA CCTGCGCAAA AAATACCAGC TGGAGCGGCG ATAGGTGTAC CGCAAA 

SEQ 25 CAGCATGACG GGCACCGCGA ACAAGGCCGC CCCCGGTGTG CCGTTTTACA CCCCGGCCCA GGAGCCTCCC GCGGGAACGC CAGTCGACGC CAGCACGG— 

SEQ 26 ^ATGACG GGCACCGCGA ACAAGGCCGC CCCCGGTGTG CCGTTTTACA CCCCGGCCCA GGAGCCTCCC GCGGGAACGC CAGTCGACGC CAGCACGG — 

SEQ 28 GGCTTACGAG ATAATCGACA ACGTTGCGGC TGAAGGGGTT CCATATTACA CACCGGCTCA AGACCCGCCA GCTGGTACGC AGACAAGCGG CTCAACG 

SEQ 29 GGCTTACGAG ATAATCGACA ACGTTGCGGC TGAAGGGGTT CCATATTACA CACCGGCTCA AGACCCGCCA GCTGGTACGC AGACAAGCGG CTCAACG 

SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 

SEQ 43 ^ATGT CCCCACCACG CTTCGAAGCG GCCCCTGCCG ACCCCTCACC GCTCGGC 

SEQ 82 AAACAAGGAG 6TTGTTCAGA ATGTCGCTGC CAAAGGAGTG CAATACTTCA ACCCTGAGCA ACTTCCTGCA CC2U36TCTCG GTATAAACGG TCCCAAT 

SEQ 84 ACCGCCTCTC 6TCGACTCGA TCGATGCACT CAAGATC2\GC A2\CTTTGTCC CCACTCGAAG TGGCCACCCT CCTCCTGGCT CGGTCCCGGA ATCCATCCTG 



CCATCACAAA 
-GCACGAGGG 
-GCACGAGGG 


ATCATCATCA ATAAGGAAGC 
ATTATTGACA ACATCGCGGC 
ATTATTGACA ACATCGCGGC 


TCCGAATGTT 
TGAAGGGGCT 
TGAAGGGGCT 


CCTTTCTATA 
CCCTACTACA 
CCCTACTACA 


CTCCAGTGCA AGATCCACCA 
CGCCTGCTCA AGACYCTCCA 
CGCCTGCTCA AGACYCTCCA 


GCAGGAACGT 
GCAGGCACAC 
GCAGGCACAC 


CTTACGATGT 
AGACCAGCGG 
AGACCAGCGG 


TCAGCCTGAA 
CTCAACCA — 
CTCAACCA — 





SEQ 1 GG CC AGAAGATCCC CAAGCTCTTC ACGCCCTTGA CCATCCGTGG CGTCACC — — — TTCCAGAAC CGCCTTGGTG 

SEQ 2 GG CC AGAAGATCCC CAAGCTCTTC ACGCCCTTGA CCATCCGTGG CGTCACC ^ TTCCAGAAC CGCCTTGGT- 

SEQ 4 GG AT CGGCACCTCC CAAGCTCTTC CGGCCGCTTT CGGTGCGGGG TCTGACC TTTCACAAT CGCATTGGCG 

SEQ S GG AT CGGCACCTCC CAAGCTCTTC "CGGCCGCTTT CGGTGCGGGG TCTGACC TTTCACAAT CGCATTGGC- 

SEQ 7 GACCCC TACGCTCTTC CGGCCCTTAC AAATCCGCAA TGTGACG -CTCAAGAAC CGCATCATG- 

SEQ 9 GG CA ATGCCGTCCC CAAGCTGTAC ACACCTCTGA CGGTGCGTGG GGTGACC- TTCCACAAC AGACTTGGC- 

SEQ 11 GA TG AliGTTGCTCC CAAAATTTTT C2iACCTTTAA AGATTGGTAA GCTTGCT TTGCCAAAC AGAATTGGG- 

SEQ 13 GCATTATTT CAACCCATAA AGATCAGTGA CTCGATC AC ATTACCTAAT AGAATTGGT- 

SEQ 15 G GCGATGTCCC TACTCTCTTC ACCCCTCTCA AGATCCGTGG TGTTGAG CTCCAGAAC CGCTTCGCC- 

SEQ 17 AARCTCTCC. CAACCCCTCA CCCTCCCCAA TGGCCTT ^AC CCTCCCCAAC CGCCTCGTC- 

SEQ 18 ^AAliCTCTCC CAACCCCTCA CCCTCCCCAA TGGCCTT ^AC CCTCCCCAAC CGCCTCGTC- 

SEQ 20 CCATCCC AACGCTGTTC AAGCCTCTGA AGATCCGTGG CGTCGAfi CTCTCCAAC CGCTTTGGC- 

SEQ 21 CCATCCC AACGCTGTTC AAGCCTCTGA AGATCCGTGG CGTCGAG CTCTCCAAC CGCTTTGGC- 

SEQ -23 C AmUVTTATTT ACTCCTCTTA AAATTAGAGG AGTGGAG TTCCATAAC AGAATGTTT- 

SEQ 25 CTCC GACGCTCTTC .AAGCCCCTCC GCATCCGCGA CGTCACC ATCAACAAC CGCATCTGG- 

SEQ 26 CTCC GACGCTCTTC AAGCCCCTCC GCATCCGCGA CCTCACC ^ATCAACAAC CGCATCTGG- 

SEQ 28 AAGCTATTC ACACCCATCA CCATCCGCGG CGTCACA TTCCCAAAC CGCCTCTTC- 

SEQ 29 AAGCTATTC ACACCCATCA CCATCCGCGG CGTCACA TTCCCAAAC CGCCTCTTC- 

SEQ 32 

SEQ 34 GG AAGCCTATTC TCTCTTATTA AAATAAGAAA CCTGACT CTTCAAAAC CGGATTTTT- 

SEQ 36 ^AGGTTTTC ACACBCATCA CCATCCGAGG CGTCACA TTCCCAAAC CGTCTCTTT- 

SEQ 37 ^MGTTTTC ACACBCATCA CCATCCGAGG CGTCACA TTCCCAAAC CGTCTCTTT- 

SEQ 39 CCTCA AGATCCGAGG TCTTACC CTCCAGAAC CGTATTATG- 

SEQ 41 

SEQ 43 ^ACGC CGCTCAAATA CCCCGTCTCG GGGCGGTCG GCGCCCAAC CGGTTCCTC- 

SEQ 82 — A ATACTCTACC AAAGGTCTTT ACACCCATCA AGATTCGCGG CATGACC ATGCCCAAC CGTATCTGG- 

SEQ 84 CCAGAGG6TG TCAAAAAACC GGCTTTGTTC CAAACGTTGA CATTGCCCTT TGCTGCACCG GAACAGGCGG GTAA6ATGAC CTTCAAGAAC CGCATCATT- 

501 511 521 531 541 551 561 571 581 591 



SEQ 1 TAAGTCCGTT TGCCCTTGCT GATATCGACG AAAGCTAATC CCCCGTCAG 7 CTCGC GCCCCTCTGC 

SEQ 2 : CTCGC GCCCCTCTGC 

SEQ 4 TGAGTGCAGT CCAGGCAATT ATGCTATCCA TCCTATGCGA GCCCTTGCAT TGGAACAGCC GCTTACAGGG AATGATAATG AGTAGCTATC GCCACTCTGC 

SEQ 5 CTATC GCCACTCTGC 

SEQ 7 GTGTC GCCCATGTGC 

SEQ 9 CTCGC GCCCCTCTGC 

SEQ 11 — — — — — — — — GTATC TCCAATGTGT 

SEQ 13 GTTTC ACCAATGTGC 

SEQ 15 GTTGC GCCCATGTGC 

SEQ 17 AAAGC CGCCATGGCC 

SEQ 18 ^AAAGC CGCCATGGCC 

SEQ 20 • GTCTC GCCCATGTGC 

SEQ 21 GTCTC GCCCATGTGC 

SEQ 25 GTCAG CCCCATGTGC 

SEQ 26 GTCAG CCCCATGTGC 

SEQ 28 CTTGC CCCTCTCTGC 

SEQ 29 CTTGC CCCTCTCTGC 

SEQ 32 

SEQ 34 GTCTC CCCAATGTGT 

SEQ 36 CTTGC CCCTCTCTGT 

SEQ 37 CTTGC CCCTCTCTGT 

SEQ 39 — — — — — — -, TTGAG GGGGCTCTGC 

SEQ 41 

SEQ 43 ^AACGC GGCCATGTCG 

SEQ 82 ' GTCAG CCCCATGTGC 

SEQ 84 GTCTC TCCCATGTGC 



601 511 621 631 641 651 661 ' 671 681 691 

*•****★***_. __________ ********** ********** * _ 

SEQ 1 CAATACTCCG CC " C2M3GAC6 GCCACATGAC CGAC TACCACATCG CCCATCTGGG TGGGATCGCC CAACGCGGAC 

SEQ 2 CAATACTCCG CC CASGACG GCCACATGAC CGAC TACCACATCG CCCATCTGGG TGGGATCGCC CAACGCGGAC 

SEQ 4 CAATACTCAG CC GACGATG GACACATGAC TCCC TGGCATATGG CACATCTTGG AGGGATTGCC CAGCGAGGGC 

SEQ 5 CAATACTCAG CC GACGATG GACACATGAC TCCC TGGCATATGG CACATCTTGG AGGGATTGCC CAGCGAGGGC 

SEQ 7 ATGTACTCCT GCGAGTCGGA CCCGTCGTCT CCCCACGTCG GCGCCCTAAC A3U^.C TACCACCTGG CCCATCTGGG CCACCTCGCC CTCftAAGGCG 

SEQ 9 CAGTACTCCG CA GAAGACG GCCACATGAC AGAC TACCACATCG CGCACTTGGG AGGTATTGCC CRGCGCGGCC 

SEQ 11 CAATATTCTG CT GATTATAATT TTGAAGCAAC TCCA TACCATTTAA TCCATTATGG TTCATTAGTG AATCGTGGGC 

SEQ 13 ATGTATTCAT CG TCA CCAACTGACA ATCAAGCCAC TCTG TTTCATTTTG TTCATTATGG ATCATTTGCT GTACGTGGflC 

SEQ 15 ACCTACTCTG CC GACGATG GCCACATGAC CGAC TGGCACCTTG TCCACCTGGG CTCCTTCGCC CTCCGCGGTG 

SEQ 17 GAACAAATGG GC TTCGGCA ACCACCTGCC CAAC CCCGAACTCG CCGCCGTCTA CGCCACCTGG GCCCGCGGCG 

SEQ 18 GAACAAATGG GC TTCGGCA ACCACCTGCC CAAC CCCGAACTCG CCGCCGTCTA CGCCACCTGG GCCCGCGGCG 

SEQ 20 ACCTACTCAG CC GACGATG GCCACCTGAC CGAC TTCCACTTGG TGCACCTGGG CCAGTTCGCC CTGCACGGCA 

SEQ 21 ACCTACTCAG CC • GACGATG GCCACCTGAC CGAC TTCCACTTGG TGCACCTGGG CCAGTTCGCC CTGCACGGCA 

SEQ 23 ACTTATTCCG CT GACCAAGAAG GGCATTTGAC AGAT TTTCACCTAG TACATCTTGG AGCGATGGGA ATGCGTGGGC 

SEQ 25 CAGTACTCCG CC GACTUVTG GCCACGCGiiC CGAC TACCACCTCG TCCACCTGGG CCAGTTCGCC CTGCACGGCG 

SEQ 26 CAGTACTCCG CC GACAATG GCCACGCGAC CGAC TACCACCTCG TCCACCTGGG CCAGTTCGCC CTGCACGGCG 

SEQ 28 CAATACTCCG CC ^AflAGATG GTTMGCCAC TGAT TGGCACTTGA CTCACCTCGG GGGAATAATC CAAAGliGGCC 

SEQ 29 CAATACTCCG CC ^AAAGATG GTTATGCCAC TGAT TGGCACTTGA CTCACCTCGG GGGAATAATC CAAAGAGGCC 

SEQ 32 

SEQ 34 CAATATTCAG CA AAGGATG GTGTCATGAC CCCC TGGCACAAAC AACACCTGGG CAGCTTCGCA GCACGAGGTC 

SEQ 36 CAATACTCCG CC RAAGKTG GATATGCTAC TGAT TGGCACTTGA CTCATCTCGG AGGCATTATC CAACGAGGCC 

SEQ 37 CAATACTCCG CC AAAGATG GATATGCTAC TGAT TGGCACTTGA CTCATCTCGG AGGCATTATC CAACGAGGCC 

SEQ 39 CAGTACTCTG CT CCCGACG GACACTACAC AATG— TGGCATCACA CCCACATGGG CGGCATCATC CAACGCGGTC 

SEQ 41 

SEQ 43 GAGGGCCTGG CG ACGTT TGACGAGGCG GACCCGTCCA AGCGCGGCAT CCCGACGGAG CAGCTGGTGC AGCTGTACCG GCGCTGGGGC CAGGGCGAGT 

SEQ 82 CAATAO^TG CC CGTGACG* GCTTTCAGCA GCCT TGGCACTTTG CCCACTACGG CGG2W:TGGCC CAACGTGGCC 

SEQ 84 CAGTACTCTG CG ^AACAATG GTCTTCCTAC TCCG TACCftCATTG CGCATTTGGG ATCGTTTGCC CTGCACGGTG 

701 711 721 731 741 751 761 771 781 791



********** *****——. _ — — — — - — _- _ 

SEQ 1 CCGGCCTGAT GCTGATTGAG GCGACCGCCG TCCAGCCCGA a GGCCGC ATCACCCCTC AGGATGTCGG TCTGTGGAAG GflCTCC CA 

SEQ 2 CCGGCCTGAT GCTGATTGAG GCGACCGCCG TCCAGCCCGA A GGCCGC ATCACCCCTC AGGATGTCGG TCTGTGGAAG GACTCC CA 

SEQ 4 CAGGATTCTT GATGGTCGAG GCAACAGCAG TCGAACCGGA A GGCAGG ATCACCCCGC AGGACCTGGG ACTATGGAAA GACTCG CA 

SEQ 5 CAGGATTCTT GATGGTCGAG GCAACAGCAG TCGAACCGGA A GGCAGG ATCACCCCGC AGGACCTGGG ACTATGGAAA GACTCG CA 

SEQ 7 CAGGCCTCGT CTTCATCGAA GCGACCGCCG TGCAGCCCAA C GGGCGC ATCTCCCCCA ACGACTCGGG CCTCTGGCAG GACGGCACCA CCTCGGAACA 

SEQ 9 CCGGTCTCAT GATGATCGAG GCAACCTCCG ' TCTCACCTGA A GGCAGA ATCACGCCGC AGGACGTCGG TTTATGGAAG GACTCG CA 

SEQ 11 CAGGTATCAC CATTGTTGAA AGCACGGCTG TTTCTCCTGA G GGTGGA TTATCACCTC ATGATTTAGG AATCTGGAAG GATGAA CA 

SEQ 13 CAGCATTAAT CATTTTAGAG AGTATCTTTG TGTCCGAAAA T TCCGGA TTATCCATTC ATGATTTAGG TCTTTGGAAT GATGAT CA 

SEQ 15 TCCCCCTCAC CATCTTCGAG GCCACCGGCG TCCTCCCCAA C GGCCGC ATCACCCCCG AGTGCTCTGG TCTCTGGCAG GACTCC CA

SEQ 17 ACTGGGGCCT GATTCTCACC GGCAACGTCC AAGTCGACCA CGCGCACAAG GGCGACGCCC ACGACATCAG CCCCAACCAC CCCGGCACCA CGCCCGAGCA 

SEQ 18 ACTGGGGCCT GATTCTCACC GGCAACGTCC AAGTCGACCA CGCGCACAAG GGCGACGCCC ACGACATCAG CCCCAACCAC CCCGGCACCA CGCCCGAGCA 

SEQ 20 CGGCCCTGAC CATTGTCGAG GCCACATCCG TCACGCCCAA C— GGACGC ATCTCGCCCG AGGACAGCG6 CCTGTGGCAA GACAGC CA 

SEQ 21 CGGCCCTGAC CATTGTCGAG GCCACATCCG TCACGCCCAA C GGACGC ATCTCGCCCG AGGACAGCGG CCTGTGGCAA GACAGC CA 

SEQ 23 CTGGCCTTGT AATGGTAGAA GCGACAGCGG TTTCCCCAGA 6 GGACGA ATTTCMCTA ATGATTCAGG ATTATGGATG GAGTCG CA 

SEQ 25 CCGCCCTGTC GATGGTCGAG GCGACCGCCG TCGAGGCTCG T GGCCGC ATCTCGCCCG AGGATGTCGG TTTGTGGCAG GACTCG CA 

SEQ 26 CCGCCCTGTC GATGGTCGAG GCGACCGCCG TCGAGGCTCG T GGCCGC ATCTCGCCCG AGGATGTCGG TTTGTGGCAG GACTCG CA

SEQ 28 CCGGATTGTC CATGGTGGAG GCTACCGCTG TACAAAACCA C— GGTCGC ATCACACCTC A6GATGTTGG TCTGTGGG3«. 6ACGGC— — CA 

SEQ 29 CCGGATTGTC CATGGTGGAG GCTACCGCTG TACAAAACCA C GGTCGC ATCACACCTC AGGATGTTGG TCTGTGGGAA GACGGC CA 

SEQ 32 ZZ 

SEQ 34 CGGGTCTCAT TGTCACAGAA GTCAACGCAG TTTCACCAGA G GGACGA ATCAGTCCTG AGGATGCAGG CATCTACGAT GATGGG CA 

SEQ 36 CGGGACTGTC CATGGTAGAG GCCACCGCTG TTCAAAACCA C GGTCGC ATCACCCCTC AGGACGTTGG TCTCTGGGAA GATGGA CA 

SEQ 37 CGGGACTGTC CATGGTAGAG GCCACCGCTG TTCAAAACCA C GGTCGC ATCACCCCTC AGGACGTTGG TCTCTGGGAA GATGGA CA 

SEQ 39 CCGGACTCAC CTGCGTTGftA GCCACAGCCG TGACTCCTCA A GGTCGC ATCACGCCTG AAGACGTCGG TATCTGGCAA GATTCT CA 

SEQ 41 ' 

SEQ 43 GGGGCCAGAT CCAGACGGGC AACGTCATGA TCGACCCGGA GCACCTCGAG GCCCCGGGCA ACATGGTGGT GCCGCGCGAC GCCGAGCCCT CGGGCGAGCG 

SEQ 82 CTGGCCTCAT CATGCTAGAA GCT2\CCGCAG TTCAAGC2W:G T GGCCGT ATCACflCCTG AAGATTCTGG CATCTGGCTA GACTCT . CA 

SEQ 84 TGGGAAACGT CATGGTCGAA GCATCTGGTG TTGASCCAGA G GGGAGG ATCACCCCTC AGGACCTGGG TATTTGGTCG GAACAG CA 

801 811 821 831 841 851 861 871 881 891



******-^ 



.******* ********** 



SEQ 1 GATCGCCCCG ^ATGCGCC GGGTCATCGA CTTCGTGCAC AGCCAGGGC- CAGAAGATCG GCGTG CAGCTT GCCCATGCCG GCCGGftAAGC 

SEQ 2 GATCGCCCCG ATGCGCC GGGTCATCGA CTTCGTGCAC AGCCAGGGC- CAGAAGATCG GCGTG CAGCTT GCCCATGCCG GCCGGAAAGC 

SEQ 4 GATTGAGCCA TTGAGCC GCGTGATCGA GTTTGTCCAC AGTCAGAAC- CAGCTTATCG GCGTG CAGATC GCACACGCAG GTCGCAAGGC 

SEQ 5 GATTGAGCCA TTGAGCC GCGTGATCGA GTTTGTCCAC AGTCAGAAC- CAGCTTATCG GCGTG CAGATC GCACACGCAG GTCGCAAGGC 

SEQ 7 ATTCCTGGGG CTGAAGC GGGTCGTCGA GTTCATGCAC GCACAGGGC- GCCAAGGTCG GGATC CAGCTT GCGCATGCGG GCCGGAAAGC 

SEQ 9 GATTGCGCCC ^ATGRAGC GCGTGATCGA CTTCGTGCAC TCGCAGTCC- CAGAAGATTG GCGTG CAGATT GCCCACGCCG GCCGCAAGGC 

SEQ 11 AGCAGAGAAA TTGAAAC CAATTGTCGA TTACGCTCAT TCTCAAAAG- CAATTAATTG CCATC CAATTG GGCCATGGTG GTAQAAAAGC 

SEQ 13 AGCTCACAGT TTACGGA AAATTGTTGA TTTTATTCAT GATCAAGAC- GGAATTTGCT GTATA CAATTG AATCACGCTG GGCGAAAGAT 

SEQ 15 GATTGCGCCC CTGAAGC GCATCGTCGA CTACATCCAC TCCCAGGGC- CAGAAGGCCG GTATC CAGCTT GCCCACGCCG GCCGCAAGGC

SEQ 17 GACCGTCACG GCCTTCAAGG CCTGGGCGGA CGCCGCGCGC CTGAATGGC- CAGTCCAAAA CGCCTGTGGT CGTGCAGATC AACCACCCTG GTCGCCAGAG 

SEQ 18 GACCGTCACG GCCTTCAAGG CCTGGGCGGA CGCCGCGCGC CTGAATGGC- CAGTCCAAAA CGCCTGTGGT CGTGCAGATC AACCACCCTG GTCGCCAGAG 

SEQ 20 GATCGCTCCT CTGCGCC GCATCGTCGA CTACGTGCAC AGCCAGGGC- CAAAAGATCG CCATC CAWITG GCTCATGCCG GCCGCAAGGC 

SEQ 21 GATCGCTCCT CTGCGCC GCATCGTCGA CTACGTGCAC AGCCAGGGC- CAAAAGATCG CCATC CAACTG GCTCATGCCG GCCGCAAGGC 

SEQ 23 AATGAAGCCG TTACGAA GAATTGTTGA ATTTGCTCAT TCGCAAAAT- CAA2VAAATTG GGATT CAATTG GCGCATGCTG GTAGRAAGGC 

SEQ 25 GATTGCGCCC CTGAAGC GCATCGTCGA CTTTATCCAC TCGCAGAAC- CAGGTCGCGG CCATC CAGCTC GCCCACGCCG GTCGCZUWSGC 

SEQ 26 GATTGCGCCC CTGAAGC GCATCGTCGA CTTTATCCAC TCGCAGAAC- CAGGTCGCGG CCATC CAGCTC GCCCACGCCG GTCGCAAGGC 

SEQ 28 GATCGAGCCT CTGAAGC GCATCACCAC TTTCGCGCAC AGTCAGAGC- CAGAAAATTG GTATC CAGCTG TCGCATGCGG GTCGCAAGGC 

SEQ 29 GATCGAGCCT CTGAAGC GCATCACCAC TTTCGCGCAC AGTCAGAGC- CAGAAAATTG GTATC CAGCTG TCGCATGCGG GTCGCAAGGC 

SEQ 32 — — . — 

SEQ 34 GCTTGGACCT CTCCGGG ATATTGTGGA CTTTGTACAC AGCCAGGGC- GCCAAGATTG CTATT CAGATA GGTCATGCTG GGAGAAAAGC 

SEQ 36 AATCGAGCCC T— TTGAAGC GCATCACTM TTTTGCCC2M: AGCCAAAGCW CAGAAGATTG GTAT TCAGCTC TCGCACGCTG GTCGTAAGGC 

SEQ 37 AATCGAGCCC TTGAAGC GCATCACTAC TTTTGCCCAC AGCCAAA6C- CAGAAGATTG GTAT TCAGCTC TCGCACGCTG GTCGTAAGGC 

SEQ 39 GATCGAGCCT C— TTGCCAA GGTCGTC-GA GTTTGCCCAC TCGCAGAAC- CAGAAGATCA TGATT CAGTTG GCGCATGCGG GCCGGAAAGC 

SEQ 41 

SEQ 43 CTTCGACATG TTTTCCAAGC TCGCCGCCGC CGCCAAGGAG CACGGCAGC- CTC-ATCGTC GCG CAGGTC GGACACCCCG GTCGCCAGGC 

SEQ 82 TGTTGAGGGA CTGCGAA AGCACGTCGA GTTTGCCCAT GCCAACAAC- TCTCTTATCG GTATC CAGATT GGCCATGGTG GTCGCAAGGC

SEQ 84 TCGGGATGCA CACAAGG CGCTGGTGTC GGTGCTCAAG TCCTTCACG- GATGGTCTGG GTGTA GGGCTG CZUICTGGCGC ATGCGGGAAG 



901 911 921 931 941 951 961 971 981 991 

********** *********^ _ _ — - — ■ — » ********** 

SEQ 1 CACCACCGTT GCGCCCTG6A TCTCA TTCTCGGCC ATCGCGACGG AGAAGGTCGG CGGATGGCCG 

SEQ 2 CACCACCGTT GCGCCCTGGA TCTCA TTCTCGGCC ATCGCGACGG AGAAGGTCGG CGGATGGCCG 

SEQ 4 CAGCACCGTC GCGCCATGGC TCTCG GCCAACGAT ACCGCCTCCG AGAAGATGGG CGGCTGGCCA 

SEQ 5 CAGCACCGTC GCGCCATGGC TCTCG GCCAACGAT ACCGCCTCCG AGAAGATGGG CGGCTGGCCA 

SEQ 7 GAGTGCCGTT GCGCCGTGGC TGGCG GCGC AGGCGGGCAA GTCGAGTCTG AAGGCGGATG AGAGCGTTGG CGGGTGGCCC 

SEQ 9 TTCGAACATC GCCCCCTGGC TCATG AA CAAGGGCATC GTCGCGACGG AGAAGGTCGG TGGCTGGCCG 

SEQ 11 TTCTGGTCAG CCCTTATTTT TGCAC -TTGGAACAA GTTGCAGATA AATCTGTCAA TGGGTTTGCC

SEQ 13 TGTTGAAGGG GTACCATTCC AACAA —ATACAACA TGGTTGGCAA 

SEQ 15 CTCCACCAAG GCCCCCTGGC ACTAC — CAGCGCGG CAAGAGCGAG CTTGCCGGCC CCGAGCAGGG TGGCTGGCCG

SEQ 17 TCCGATGGGC GCGGGCACGC GGGGA — CTGT GGGAGAAGGC GGTGGCGCCC TCGCCGGTGC CGTTGGTGTT GGGAGAGGCG 

SEQ 18 TCCGATGGGC GCGGGCACGC GGGGA CTGT GGGAGAAGGC GGTGGCGCCC TCGCCGGTGC CGTTGGTGTT GGGAGAGGCG 

SEQ 20 CAGCACAAAG GCCCCCTGGC ACGACTCCTT CACCCCCAGC GGCGAGTATA AGCCGAGAGA GGGCTTACAG GTC6TCGGAC CCGASTATGG CGGCTGGCCT 

SEQ 21 CAGCACAAAG GCCCCCTGGC ACGACTCCTT CACCCCCAGC GGCGAGTATA AGCCGAGAGA GGGCTTACAG GTCGTCGGAC CCGAGTATGG CGGCTGGCCT 

SEQ 23 TAGCACCACT GCTCCTTATC GAGGA TACACA GTTGCGACTG AAGCTCRAGG TGGGTGGGAG 

SEQ 25 TAGCACCCTG GCACCGTGGA TCACC GAGGCTCG CGGCAAGGCG CTGGCTCAGG AGAfiCGAGAA CGGGTGGCCC 

SEQ 26 TAGCACCCTG GCACCGTGGA TCRCC GAGGCTCG CGGCAAGGCG CTGGCTCAGG AGAGCGAGAA CGGGTGGCCC 

SEQ 28 CAGTTGCGTA TCTCCCT6GC TAAGC GTAAATGCT GTCGCGGCGG AAGAAGTGGG TGGCTGGCCA 

SEQ 29 CAGTTGCGTA TCTCCCTGGC TAAGC GTAAATGCT GTCGCGGCGG AAGAAGTGGG TGGCTGGCCA 

SEQ 32 

SEQ 34 GAGCACAGTC GTACCGTGGC TGGAC CGCAAGAAC ACTGCTTTTA 

SEQ 36 TAGTTGTGTA TCTCCGTGGT TGAGC ATCAACGCT GTTGCCGCTA AGGAAGTCGG TGGCTGGCCA

SEQ 37 TAGTTGTGTA TCTCCGTGGT TGAGC -ATCAACGCT GTTGCCGCTA AGGAAGTCGG TGGCTGGCCA 

SEQ 39 GAGCACTGTG GCACCATGGT TAAGC GGCGGCGAT GTTGCTGGTG AGGACGTCAA CGGATGGCCA 

SEQ 41 GACT GCCGAGTAAA CGCGCCGGCA AGGAGGCGGG AGGATGGCCG 

SEQ 43 CCGCGGCAGC GTCCAGCAGC ACCCC ATTAGCGC CAGCGACGTG CAGCTTAAGC AGGAGATG 

SEQ 82 CTCCTGCGTT GCTCCTTGGT TAGAC GCCGGACTT GCCGCTGAAA AGGCCGCTGG TGGATGGCCC 

SEQ 84 GAAGGCCTCG GACTGGTCAC CTTTC TACC GCGGAGAAAA GAAGCAAAAG TTTGTGACGC AGGAGGAAGG TGGCTGGCCG 

1001 1011 1021 1031 1041 1051 1061 1071 1081 1091 



********** ********** ********^^ ^********* ********** *****^^„^^ 

SEQ 1 GACCCGCGTC AAAGGGCCCG GCGATATC— ' CCCTTTGCG GAGCCCTTCG CCAAGCCCAA GGCCATGACG 

SEQ 2 GAC-CGCGTC AAAGGGCCCG GCGATATC -CCCTTTGCG GAGCCCTTCG CCAAGCCCAA GGCCATGACG 

SEQ 4 GGC-CGCGTC AAAGGCCCGA CAAATGTG CCCTTCACC GTTAAGAACC CTGTGCCGAA GGAGATGACC 

SEQ 5 GGC-CGCGTC AAAGGCCCGA CAAATGTG — -CCCTTCACC GTTAAGAACC CTGTGCCGAA GGAGATGACC 

SEQ 7 GCG-GATGTG GTGGGTCCGT CGGGCGGG GAGGAGC ATATCTTTAG TCCCGAGGAG GATGCGTATT GGGTGCCGCG GGCGCTGAGC 

SEQ 9 GAT-CGTGTG ATCGGCCCGT CCAGCGTG— * CCCTTCCAC GA6ACTTTCC CCACCCCCAA GGCCATGACC 

SEQ 11 GAC-AAAGCA GTTGCTCCTT CTGCATTG— — -GCATTC- AGACCAAAT GGTAATTTAC CTGTTCCTAA TGAGTTGACC

SEQ 13 GAA-CATTGT GTGGGGCCAT CTACTGAG CCATTTAGT GATTCACACA ATACACCACG AGAATTGACT 

SEQ 15 GAG-A21CGTC TGGGCCCCCA GCGCCATC AG CTACAACGA6 GAGACCTTCC CCTTCCCCAA GGAGATGACC

SEQ 17 TTT-GTGCCT CGCTTGTTGT CGAAAGTG CTTTTCG GCACGCCGCG GGAGCTGACG 

SEQ 18 TTT-GTGCCT CGCTTGTTGT CGAAAGTG CTTTTCG GCACGCCGCG GGAGCTGACG 

SEQ 20 GAT-GACGTC TGGGCCCCGA GCGCCATC CCGTTCTCG GAGCACTTTC CGAACCCCAA GGAGATGACC 

SEQ 21 GAT-GACGTC TGGGCCCCGA GCGCCATC CCGTTCTCG GAGGACTTTC CGAACCCCAA GGAGATGACC 

SEQ 23 AAT-GATGTT TATGGACCAA ATGAAGAC AGGTGGGAC GAAAACCACG CTCAACCTCA TAAGTTAACT 

SEQ 25 GAC-GACGTT GTGGCTCCCA GCGCGATT CCTTACACC AAGQACTGGG CCACACCGCG TGAGTTGACT 

SEQ 26 GAC-GACGTT GTGGCTCCCA GCGCGATT CCTTACACC AAGGACTGGG CC2VCACCGCG TGAGTTGACT 

SEQ 28 GAC-AMATC GTTGCTCCCT CGGCCATC GC ACAAGAAAAT GGTGTGAACC C21GTTCCCAA GGCTTTCMG 

SEQ 29 GAC-AATATC GTTGCTCCCT CGGCCATC GC ACAAGAAAAT GGTGTGAACC CAGTTCCCAA GGCTTTCACG 

SEQ 32 

SEQ 34 

SEQ 36 GAC-AACATT GTTGCTCCTT CTGCCATC — GC ACAAGAAGCT GGCGTGAACC CTGTTCCCAA GGCCTTCACC

SEQ 37 GAC-AACATT GTTGCTCCTT CTGCCATC GC ACAAGAAGCT GGCGTGAACC CTGTTCCCAA GGCCTTCACC 

SEQ 39 CAG-GATGTC TGGGCGCCCA GTGCGATT CCATGGAAC GAGAAGCACG CTGTCCCAAA GGAGATGTCG 

SEQ 41 GAG-GATGTT GTGGGTCCGT CGG6TGGGGA 6GACTTTACG TGGGATGAGA GGTCCTCGAG CGACCCTAGT GGAGGCTACT ATGCGCCGAG AGAGTTGTCG 

SEQ 43 TTTGGG TCAAAGTTTG GCGTGCCCAG GCCCGCTACC 

SEQ 82 GAT-6ACGTT GTCGGACCTA GCAACGAG-^ CCTTTTGCT CCTGGCTACC CTACCCCCCG TGCTATTACT 

SEQ 84 GAT-CGTGTC GTCGCTCCTT CGGCCATC GCATATGC6 CAAGGTCACG TTACCCCTCG AGCTCTCACG 

1101 1111 1121 1131 1141 1151 1161 1171 1181 1191 



SEQ 1 CTGGATGA-G ATCGAGCAGT TCAAGAAGGA CTGGGTGGCG GCCACGlUySC GCGCCATGGC CG CCGGT GCGGACTTTG TCGAGATTCA CAATGCGCAT 

SEQ 2 CTGGATGA-G ATCGAGCAGT TCAAGAAGGA CTGGGTGGCG GCCACGAAGC GCGCCATGGC CG CCGGT GCGGACTTTG TCGAGATTCA CAATGCGCAT 

SEQ 4 AAGCAGGA-T ATCGAGGATC TGAAGACCGC CTGGGTGGCC GCTGTCAAAC GGGCTGTTAA GG CCGGA GCCGACTTTA TCGAGATCCA CAATGCGCAT 

SEQ 5 AAGCAGGA-T ATCGAGGATC TGAAGACCGC CTGGGTGGCC GCTGTCAAAC GGGCTGTTAA GG CCGGA GCCGACTTTA TCGAGATCCA CAATGCGCAT 

SEQ 7 ACGGCCGA-G GTCCGTCAGG TGGTGGCGGC GTTTGCGAAG AGCGCGCGGC TAGCGGTGCA GG CTGGG GTGGATGTTA TCGAGATCCA TGGGGCGCAT 

SEQ 9 AAGGACGA-C ATCGAGCAGT TCA2\GCGCGA CTGGTTTGAT GCGTGCAAGC GGGCCATTGC CG CTGGC GCGGACTTCA TCGAGATCCA CAATGCCCAC 

SEQ 11 AAAGATGA-A ATCAAACGTG TTGTTAAGGA TTTTGGTGCT GCTGCTAGAA GAGCTGTTGA AATCAGTGGC TTTGATGCAG TTGAGATTCA TGGTGCTCAT 

SEQ 13 GTTAATGA-A ATAAATTCAA TTGTGGAAGA CTTTGCCAAT GCAGCTTGGC GGGCTGTGGA AATCTCAAAA TTCGATGCCA TTGAAATACA TTGTGCTAAT 

SEQ 15 GTCGA6CA-G ATCCAGGAGC TCGTCGAGGC CTGGAAGGC6 TCTGCCCA6C GTGCCCTCAA GG CCGGC TTCGACCTCA TTGAGATCCA CGCCGCCCAC

SEQ 17 GTTGCGGA-G ATCAAGGATA TCGTGCAAAA GTTTGCGGTG ACGGCGAGGA TCACGGCCGA GG CCGGG TTCAATGGCG TCGAGATCCA TGCGGCGCAT 

SEQ 18 GTTGCGGA-G ATCAAGGATA TCGTGCAAAA GTTTGCGGTG ACGGCGAGGA .TCACGGCCGA GG CCGGG TTCAATGGCG TGGAGATCCA TGCGGCGCAT 

SEQ 20 GTTGAGGA-G ATTGAGGGAC TCGTCACCAG CTTTGTGGAC GCTGCCAAGC 6TGCCATCGA GG CCGGC GTCGACATTA TTGAGATTCA CGGCGCTCAC 

SEQ 21 GTTGAGGA-G ATTGAGGGAC TCGTCACCAG CTTTGTGGAC GCTGCCAAGC GTGCCATCGA GG CCGGC GTCGACATTA TTGAGATTCA CGGCGCTCAC 

SEQ 23 GAAAAGCA-A TATGATGAAT TAGTGGATAA GTTTGTTGTT GCTGCCAAGC GTGCAGTTGA AA TAGGT TTTGATGTAA TTGAAATTCA TGGCGCTCAT 

SEQ 25 ACCGAGGRRG TCGAGGGTCT GGGTGAAGAA " GTTCGCCGAG TCGGCCAAGA GGTCAAATCG A GCTGGT TTTGACGTCA TTGAGATCCA CGCCGCTCA- 

SEQ 26 ACCGAGGR-G TCGAGGGTCT GGGTGAAGAA GTTCGCCGAG TCGGCCAAGA GGTCAAATCG AG CTGGT TTTGACGTCA TTGAGATCCA CGCCGCT 

SEQ 28 AAGGAGGA-T ATAGAGCAAC TCAAGAGCGA CTACGTGGAA GCGGCAAAAC GAGCCATCCA TG CTGGT TTCGATGTTA TCGAAATTCA TGCAGCTCAT 

SEQ 29 AAGGAGGA-T ATAGAGCAAC TCAAGAGCGA CTACGTGGAA GCGGCAAAAC GAGCCATCCA TG CTGGT TTCGATGTTA TCGAAATTCA TGCAGCTCAT 

SEQ 32 

SEQ 34 

SEQ 36 AAGGAGGA-T ATCGAGGAAC TCAAGAATGA CTTTCTGGCT GCAGCMAAAC GAGCCAWCCG CGC TGGT TTTGATGTCA TCGAGATCCA TGCAGCTCAT 

SEQ 37 AAGGAGGA-T ATCGAGGAAC TCAAGAATGA CTTTCTGGCT GCAGCMAAAC GAGCCAWCCG CGC TGGT TTTGATGTCA TCGAGATCCA TGCAGCTCAT 

SEQ 39 TTGGATGA-T ATCGAGGCTT TCAAGAAGGC GTTT6GAGAG GCGGTCAAGC GGGCATTGAA GGC TGGA TTTGATGTTA TTGAGATTCA CAATGCTCAC 

SEQ 41 GTCAGAGA-G ATCAAGGAGA TGGTCCAAGA CTGG6CGACA GCAGCGAAAA GGGCGGTGAA AGC 66GC GTGGATGTAA TCGAAATCCA CGGCGCGCAT 

SEQ 43 AAGGAGGA-T ATTAAGGCGG TGATTGAGGG TTTTGCCCAC 2VCGGCCGAGT ACCTTGAAAA GGC CGGT TTCGACGGTA TCGAATTGCA CGCCGCCCAC 

SEQ 82 CTTGAAGA-G ATTGAACAGT TGAAGGAGGA CTTT6TTTCC GGTGTTCGTC GAGCGGTTGA AG CAGGA TTTGACACTA TCSACTTCCA TTTCGCTCAC 

SEQ 84 ACCGAGGA-C ATCAACAAGT TGCAAGACAA ATTCGTTCAG TCGGCACGAT GGGCGTTTGA AG CTGGG TATGACTACG TCGAACTTCA CAGCGCTCAC 



1201 1211 1221 1231 1241 1251 1261 1271 1281 1291 



SEQ 1 GGATACCTGC TGTCGTCATT CCTCTCGCCG GCCGCCAAC 

SEQ 2 GGATACCTGC TGTCGTCATT CCTCTCGCCG GCCGCCAAC 

SEQ 4 GGCTATCTTC TGATGTCGTT CCTCTCCCCT GCGGTCAAC 

SEQ 5 GGCTATCTTC TGftlGTCGTT CCTCTCCCCT GCGGTCAAC 

SEQ 7 GGCTATCTCA TCAACGAGTT CCTGAGCCCG GTCACGAAT T 

SEQ 9 GGCTATCTTC TCTCGTCTTT CCTATCACCG TCTTCCAAC ' 

SEQ 11 GGTTATTTGA TTAATGAGTT CTATAGTCCT ATTTCAAAC- 

SEQ 13 GGATGTTTJUV TACACCAATT TTTAAGTAAA TTGACAAAC- 

SEQ 15 GGCTACCTCA TTTCCGAGTT CTTGAGCCCC ATCTCCAAC- 

SEQ 17 GGATACCTGT TGGCGCAGTT CTTGAGCAAG AAGACAAAC- 

SEQ 18 GGATACCTGT TGGCGCAGTT CTTGAGCAAG AAGACAAAC 

SEQ 20 GGTTACCTGA TCACCGAGTT CCTTTCGCCG CTATCAAACG TAAGTGGAGA TACTTTGTGT GGGGCTGTGC GCATACTCCC TCGGGTGTGA CTTCTATTAA 

SEQ 21 GGTTACCTGA TCACCGAGTT CCTTTCGCCG CTATCAAAC 

SEQ 23 GGTTATCTTA TATCGTCflAC AGTTAGTCCT GCCACTAAT 

SEQ 25 

SEQ 28 GGATATCTAC TGCATCAATT CTTGAGTCCG GTAAGCAAT 

SEQ 29 GGATATCTAC TGCATCAMT CTTGAGTCCG GTAAGCAAT 

SEQ 36 GGATACKTGC TTCACCAGTT CTTGAGTCCA GTCAGTAAC- — — • 

SEQ 37 GGATACKTGC TTCACCAGTT CTTGAGTCCA GTCAGTAAC . 

SEQ 39 GGATACCTGC TCCACGAATT CATCTGCCTG AGAGCAACA 

SEQ 41 GGGTACCTCA TCCACGAATT CCTCTCACCC ATTACCAAC 

SEQ 43 GGTTACCTGC TGGCCCAATT CCTGTCCGAA AC2U^CC2\AC 

SEQ 82 GGTTATCTTG TTTCCAGCTT CCTGTCCCCT GCCACCAAC 

SEQ 84 GGATACCTGA TGCACTCGTT CCTGAGCCCG TTGACCAAT 

1301 1311 1321 1331 1341 1351 1361 1371 1381 1391 



SEQ 1 ^AACCGCAC GGACCAGTflC GGCGGGTCGT TCGAGAACCG CATCCGGCTG TCTCTCGAGA TTGCGCAGTT GACTCGGGAC 

SEQ 2 ^AACCGCAC GGACCAGTAC GGCGGGTCGT TCGAGAACCG CATCCGGCTG TCTCTCGAGA TTGCGCAGTT GACTCGGGAC 

SEQ 4 ACGAGAAC AGACGAGTAC GGAGGCAGTT TTGAGAATCG CATCCGGCTG AGTCTGGAGA TCGCCAAGCT CACCCGCGAA 

SEQ 5 — — — ACGAGAAC AGACGAGTAC GGAGGCAGTT TTGAGAATCG CATCCGGCTG AGTCTGGAGA TCGCCAAGCT CACCCGCGAA 

SEQ 7 AAGCGGAC GGATGCGTAC GGCGGGAGCT TTGAGAACCG GACCCGGATC GTGCGCGAGG TTGCGGCGGC TATTCGTGCG 

SEQ 9 — ACGCGCAC CGACGAGTAC GGCGGGTCGT TTGAGAACCG CATCCGGCTG TCTCTCG2\AA TCGCCCAGGT CACCCGTGAC 

SEQ 11 — — AAGAGAAC .AGATGAATAC GGTGGCAGTT TTGAAAATAG AACCAGATTT TTAAAGGAAG TTATCGATAG TGTTAAATCA 

SEQ 13 — — AAGAGAGC TGACCAATAC GGGGGCTCAT TTGAAAACAG AGTTAGATTT CTTTTACAAA TAATTGAGAA TATAA2\ACGA 

SEQ 15 CAGCGTAC GGACCAGTAC GGTGGCTCCT TCGAGAACCG CACCCGCGTT CTCCGCGAGA TCATCTCGGC CGTCCGCTCC . 

SEQ 17 ^AGGCGCGG GGATGAGTAT GGCGGGTCGG CTGAGAACAG GGCGAGGATT GTTGGGGAGA TTATTAAGGA GTGCAGGAGG 

SEQ 18 AGGCGCGG GGATGAGTAT GGCGGGTCGG CTGAGAACAG GGCGAGGATT GTTGGGGAGA TTATTAAGGA GTGCAGGAGG 

SEQ 20 CATTTTATTT CCTGGCAC6C AGAAACGGAC AGACAAGTAC GGCGGCAGCT TTGAGAACCG CACCCGGGTC CTGATCGATA TTATCAAGGC C6TCCGGGCA 

SEQ 21 AAACGGRC AGACAAGTAC GGCGGCAGCT TTGAGAACCG CACCCGGGTC CTGATCGaTA TTATCAAGGC CGTCCGGGCA 

SEQ 23 GacCGCAA TCZ^CAAGTAT GGTGGGRCAT TTGAGAAACG TATTTTGTTT CCTATGGAAG TTGTCCATTC TGTTCGTAAA 

SEQ 26 

SEQ 28 CAAAGAAC CGACGAGTAT GG- 

SEQ 29 CAAAGAAC CGACGAGTAT GG 

SEQ 32 AAC CGACGAGTAT GGTGGCAGTT TCGAGAACCG TATCAGAGTT GTCTTGGAAA TCCTTGACCT CATCCGCGCT 

SEQ 34 

SEQ 36 — — CAAAGAAC GGATGAGTAT GGTGGCAGCT TCGAGAACCG TATCAGAGTC GTCTTGGAGA TCATTG 

SEQ 37 — —CAAAGAAC GGATGAGTAT GGTGGCAGCT TCGAGAACCG TATCAGAGTC GTCTTGGAGA TCATTG 

SEQ 39 CCAGGftCC GACAAGTACG GGCGGAAGCT GGGAAAACCG CACTCGTCTG ACAATGGAAA GTCGTCGACC TTGTCCGCAG 

SEQ 41 — — — — - — CGCCGGAC AGATTCTTAC GGCGGTTCTT TCGAAAACCG TACCCGTCTA CTCATTGAAA TCGTAACAGC CGTCCGAGCC 

SEQ 43 CAGCGCAC CGACGAGTAC GGCGGCAGCC TCGAAAACCG CATGCGGCTA ATCCTCGAGG TCACGGCCGA GGTCCGCAGG 

SEQ 82 — — — — ^AAGCGTAC CGACAAGTAC GGAGGT2\GCT TCGAGAACAG AGTGCGCCTT GCTCTCGAGA TT6TCGAGGC TGCACGAGCT 

SEQ 84 CAGCGTAC CGACGAGTAC GGCGGTAGCC TGGAGAACCG CGCTCGATTT CTGCTCAACG TTGCCCGTCG AATCCGCCAA 

1401 1411 1421 1431 1441 1451 1461 1471 1481 1491 



SEQ 1 GCCGTCGGCC CTCATGTGCC C GTTTT CCTGCGCATT TCGGCCTCGG ACTGGTGCGA GGAGACCCTG CCGGA 

SEQ 2 GCCGTCGGCC CTCATGTGCC C GTTTT CCTGCGCATT TCGGCCTCGG ACTGGTGCGA GGAGACCCTG CCGGA 

SEQ 4 AATGTGCCCA AGGATATGCC T GTCTT CCTGCGGGTC TCCGCCACCG ATTGGCTGGA GGAGGTGCAG CCGAA 

SEQ 5 AATGTGCCCA AGGATATGCC T GTCTT CCTGCGGGTC TCCGCCACCG ATTGGCTGGA GGAGGTGCAG CCGAA 

SEQ 7 GTGATTCCCG AGGGGATGCC C ~ — - — CTGTT TCTGCGTATC AGCGCCACGG AGTGGTTGGA GGGTCAGCCG GTGGC— 

SEQ 9 GCCGTCGGCC CCARCGTTCC T GTTTT TCTCCGTGTC TCCGCGACGG ACTGGATCGA GGAGACCCTC CCCGA 

SEQ 11 AGTATTCCAA ACGATGTTCC A — GTGTT TTTGAGAATC TCTGCTGCTG AAAATAGTCC TGATCCA— 

SEQ 13 AAGATAGAAA CA CC G ^ATTTT CTTAAAGTTT CCAATGTCAG ATAATTGTAG TGATCCG 

SEQ 15 GTCATCCCC6 AGGACATGCC C CTCTT CGTCCGTGTC TCCGCCACCG AGTGGATGGA GTACACC 

SEQ 17 CAGGTGACTG AGGC6GTGGG TGAAGAGGAG GCGAAGAAGT TTGTGGTGGG AATCAAGCTG AACAGTGCGG ATTGGCiWSGC GGGACGCGAT GGA — A 

SEQ 18 CAGGTGACTG AGGCGGTGGG TGAAGAGGAG GCGA2\GAAGT TTGTGGTGGG AATCAAGCTG AACAGTGCGG ATTGGCAGGC GGGACGCGAT GGAAAG 

SEQ 20 GTGATTCCCG AGGAGATGCC A CTCTT CGTCCGAATC TCCGCCACCG AATGGATGGA . GTACGCCGGC 

SEQ 21 GTGATTCCCG AGGAGATGCC A CTCTT CGTCCGAATC TCCGCCACCG AATGGATGGA GTACGCCGGC 

SEQ 23 GCAATTCCAG ATAGTATGCC C TTGTT TTATAGAGTA AC6GCTACAG ATTGGTTCCC CAAAGGACAA 

SEQ 25 ^ 

SEQ 26 

SEQ 28 -: 

SEQ 29 ■ 

SEQ 32 GCCATCCCCG AAACTACACC T — — — — — — — GTCCT CGTTCGTGTC AGTGCAACTG ATTGGTTCGA GTTTGACTCT CAATTCAAAG 

SEQ 34 

SEQ 37 

SEQ 39 CATT 1 

SEQ 41 GCGATGCCCT CCAGCATGCC T CTCTT CCTCCGCCTC TCCTCTACAG AATGGATGGA AGATACCGAC ATCGGC 

SEQ 43 CGGACGAGCA AGAATTTCAT C — - — — CTCGG CATCAAAATT AACAGCGTCG AGTTCCAGGA GAAG 

SEQ 82 GTTATGCCTG AGGACATGCC C— — — TTGTT CACTCGCATC AGTGGAACTG ACTGGCTGGA GAACAACCCT GAG — 

SEQ 84 GAATTCCCCA ACAAGGGT CTCTG GGTGCGCGTC AGCTCCACCG ACTGGGCCGA CCAAGCGCAC CAA 



1501 1511 1521 1531 1541 1551 1561 1571 1581 1591 



— „ — — — — — — _-**★*** 

SEQ 1 GCMRGCTGG AAGTCGGAGG ATiW:CGTGCG GTTCGCGCAG GRGCTCGTCA AGCAGGGCGC CGTTGATCTG ATCGATATCA GCAGCGGTGG 

SEQ 2 GCAGAGCTGG AAGTCGGAGG ATACCGTGCG GTTCGCGCAG GAGCTGGTCA AGCAGGGCGC CGTTGATCTG ATCGATATCA GCAGCGGTGG 

SEQ 4 CAA GCCCAGCTGG CGAGGCGTGG ACACTGTCCG ATTTGCCAAG ATCCTGGCAG AAACGGGTTA CGTTGACGTG CTTGACGTGA GCAGTGGCGG. 

SEQ 5 — CAA GCCCAGCTGG CGAGGCGTGG ACACTGTCCG ATTTGCGAAG ATCCTGGCAG AAACGGGTTA CGTTGACGTG CTTGACGTGA GCAGTGGCGG 

SEQ 7 -CGCGGAGTC GGGCAGCTGG GATAT GC AGAGCTCGCT GGAGCTGGTC AAGAAGCTGC CCGAATGGGG CATTGACCTG GTGGATGTCA GCTCCGCCGC 

SEQ 9 GGAATCGTGG AAGCTCTCTG ACTCCGTCCG CTTCGCCGAA GCCCTCGCTG CCCAGGGCGC TATTGACCTG ATCGACGTCT CTTCCGGCGG 

SEQ 11 GAAGCTTGG ACTATTGAAG ATTCCAAAA AATTAGCT GACATTTTAG TAGAAAAGGG TATTGCTTTG GTTGATGTTT CATCTGGTGG 

SEQ 13 GAAGCGTGG TCTACGGAAG ATGCATTGA AGTTGGCC GATCTTGTXA TTGATTTAGG AGTAAAGGTG ATCGACGTTA CATCAGGTGG 

SEQ 15 GGCCA GCCCTCGTGG GACCTCCAGC AGACCATTG AGCTCGCC iVlGATCCTCC CCGACCTCGG CGTCGACCTC CTCGACGTCT CTTCCGGCGG 

SEQ 17 AGGAGGAGGA GGAGACGGAT ACGGCGGAGG AGGTGTTGA- — AGCAGATT GAGCTTTTTG AGCAGTGGGG GATCGACTTT GTCGAGGTTA GCGGTGGCAG 

SEQ 18 — GAGGAGGA GGAGACGGAT ACGGCGGAGG AGGTGTTGA- — ^AGCAGATT GAGCTTTTTG AGCAGTGGGG GATCGACTTT GTCGAGGTTA GCGGTGGCAG 

SEQ 20 GA GCCTAGCTGG GACCTCGAGC AGAGCACAC ^AGCTTGCC AAGCTCCTCC CGGACCTGG& TGTCGACCTG CTCGACGTCA GCTCGGGCGG 

SEQ 21 GA GCCTAGCTGG GACCTCGAGC AGAGCACAC ^AGCTTGCC AAGCTCCTCC CGGACCTGGG TGTCGACCTG CTCGACGTCA GCTCGGGCGG 

SEQ 23 GGATGG GAGATAGZVAG ATACAGTTG CATTAGCA GCGAGGCTTC GCGATGGTGG TGTTGACTTG ATAGATGTTA GCTCTGGTGG 

SEQ 26 

SEQ 28 

SEQ 29 

SEQ 32 ACGAGTTTCC TGAAAGCTGG ACAGTCGAGC AGACTT G TCAACTCGCG CGTATCTTGC CCAAGCATGG AGTAGACTTG GTGGACGTCA GCTCAGGCGG 

SEQ 34 

SEQ 36 — 

SEQ 37 

SEQ 39 

SEQ 41 — AAGAAGTT CGGAAGCTGG GATGTCGAAA GCACGATCA- — AGATCTCC AAAATCCTGG CCGACTTGGG CGTTGATCTG CTCGACGTGT CTTCCGGTGG 

SEQ 43 GGTTTCAAG CCA GAGG AGGCGGTGC ^AGTTGTGC GAGGCCCTCG AGGCCGCGGG CATGGATTTT GTCGAGACGA GCGGCGGCAC 

SEQ 82 — TACGAGGG AGAGACCTGG ACTCTTGAGC AGAGCATCA ^AGCTTGCA CACCAGTTAG CAGACCGTGG TGTCGATGTT TTGGATGTTT CCAGTGGTGG 

SEQ 84 GC CGACTCTTGG ACCGTTGACC AGACGGTTG ^AACTCGCC AAGATGCTCC AAGAGGCTCG AGTCGACCTG CTAGACGTCA GCTCCGGCGG 

1601 1611 1621 1631 1641 1651 1661 1671 1681 1691 



SEQ 2 TGTTCTCGCG CAG . 

SEQ 4 CACTCATTCG GAG 

SEQ 5 CACTCATTCG GAG >■ 

SEQ 7 GAACCACAAG GAC 

SEQ 9 TGTCCACGCC GCG 

SEQ 11 TAACGATTAT AGA 

SEQ 13 AAATGTTGCG CAT 

SEQ 15 CAACAACAAG GAC 

SEQ 17 TTATGAGGAT CCTCAGGTAA GTTTTGGTGT TGTTTGAGGG ATGGGGCAAG GGGTTGTCTG TCGTGAACAA CAAflMGGGC ACGGAACAAA TGCTAACGCC 

SEQ 18 TTATGAGGAT CCTCAG 

SEQ 20 AAACTCGGTG GCC 

SEQ 21 AAACTCGGTG GCC 

SEQ 23 TAATC21CA3M3 GAT 

SEQ 25 ; 

SEQ 26 

SEQ 29 

SEQ 32 TATCCATCCT AAG 

SEQ 34 

SEQ 36 ' 

SEQ 37 

SEQ 41 GAATCATCCT CAG — — — 

SEQ 43 CTATGAGAGT TTT 

SEQ 82 CATCCACAAG ATG 

SEQ 84 CCTGGTTCCA TTC 

1701 1711 1721 1731 1741 1751 1761 1771 1781 1791 

#### ##*####### ########## rnmrn^ #» 

SEQ 1 — — — — — CAG AAGATCAAGT CCGGCCCTGC CTTCCAGGTG CCTTTTGCCG TGGCCGTGAA GAAGGCCGTC GGCGAC 

SEQ 2 — — — -CAG AAGATCAAGT CCGGCCCTGC CTTCCAGGTG CCTTTTGCCG TGGCCGTGAA GAAGGCCGTC GGCGAC 

SEQ 4 CAG CATATCCACG CGAAGCCAGG CTTCCAGGCA CCCTTTGCTA TTGCCGTCAA GAAGGCCGTC GGGGAC 

SEQ 5 CAG CATATCCACG CGAAGCCAGG CTTCCAGGCA CCCTTTGCTA TTGCCGTCAA GAAGGCCGTC GGGGAC

SEQ 7 CAG AAGATCAACC TGCACACGGC CTACCAGACG GACCTGGCCG GGCAGATTCG CCAGGCCATC CGAGCG 

SEQ 9 CAG AAGATCAAGT CCGGGCCGGC TTTCCAGGCT CCCTTCGCTG TGGCTATCAA GAAGGCCGTT GGCGAT 

SEQ 11 C AACCACCAAG ATCTGGGATC AGTAAAGAGT TGAGAGAGCC AATCCATGTT CCGTTGTCTC GTGCAATTAA ACAACATGTT GGTGAC 

SEQ 13 T GCAAATCTAG ATATCTATTA AATGACGACA .AACAACTACC TTCTC2Ui.GTG CCCTTGGCTC GTAAATTGAA AAGCCACATT AGAAAC 

SEQ 15 CAG AAGATCAACG TCCACACCTA CTACCAGATC GACATGGCCG AGCAGATCCG CGCGGCCGTG CACGMGCCG 

SEQ 17 ATACAGATGG CCAACGGTCC CAAGCCCGftA AAGTCCGAAC GCACCATGGC CCGCGAG6CC TTCTTCCTCG AGTTCGCCAA GATCATCCGC ACCAM T 

SEQ 18 ATGG CCIUVCGGTCC CAAGCCCGftA AAGTCCGRAC GCACCATGGC CCGCGAGGCC TTCTTCCTCG AGTTCGCCAA GATCATCCGC ACCAAG T 

SEQ 20 ; CAA AAGATCGAGC TCACGCCGTA CTACCAGATC GACCTGGCAG CCAAGATCCG CGAGGCCGTC GGCGAT 

SEQ 21 1 CAA AAGATCGAGC TCACGCCGTA CTACCAGATC GACCTGGCAG CCAAGATCCG CGAGGCCGTC GGCGAT 

SEQ 23 CAA AGAATTGAGG TGAAGGATTG CTATCAAGTT CCTTTTGCGG AAAAGATTAA GGATCAAGTG AATGGA 

SEQ 25 

SEQ 26 

SEQ 28 

SEQ 29 

SEQ 32 TCCGCCATC GCCATCAAGT CCGGTCCTGC TTACCAGGTA GACCTCGCCA AACAGGTAAA GAAGGCTGTT GGCGAT 

SEQ 34 

SEQ 36 

SEQ 37 

SEQ 39 

SEQ 41 CAG AAAATCAACA TGTTCAACAC C — — — — 

SEQ 43 G GTTTTGCGCA CCGCAAGGAG TCCAGCCGCA AGCGGGAG2UV CTATTTTATC GAGTTCGCCG AGGTCATCCG GAAGGCCGTC AAGCAC 

SEQ 82 : CAA AA6GTCGCTG CTGGTCCCGG TTACCAGGCA CCTCTTGCCA AGGCGATCAA GAAGTCAGTT GGAGAC 

SEQ 84 CAA AAAATCACCG TGGGAGCCGG ATACCAGCTA TTCGGAGCAA AAGCCGTTCG CGATGCTCTG GCCAAA 



1801 1811 1821 1831 1841 1851 1861 1871 1881 1891



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84 



^AAGCT 

lAGCT 

^flAACT 

^flRACT 

GCTGG 

AAGCT 

AAGTT 

CGATG 

GCAAGCAGCT 
TCCCCAAGCT 
TCCCCAAGCT 

^AGGTT 

AGGTT 

^AT 



6CTGGTIGCC 
GCTGGTTGCC 
CGCAGTGGCA 
C6CAGTGGCA 
CGCGTCGACT 
CCTTGTTGCG 
ATTGGTCAGT 
TTTGATCGCA 
CCTCGTCGGT 
TCCTCTCATG 
TCCTCTCATG 
GCTCATAGGC 
GCTCATAGGC 
ACTACTTGGC 



GCCGTGGGTG CCATCACC- 
GCCGTGGGTG CCATCACC- 
TCAGTGGGTA TGATTGCC- 
TCAGTGGGTA TGATTGCC ^ 



-AACG GCAAGCAGGC 
-AACG GCAAGCAGGC 
-AfiCG CGCATTTGGC 



^AGCG CGCATTTGGC 

CTTGTGGGTG CTGTAGGTCT GATCACCGAT TCGGAACAGG CGAGGGGACT AGTTCAGGGA GCGGMGAGG CGACTGCAGC 

ACGGTGGGCA CGATCACG ' AACG GTAAGCAGGC 

TGCGTTGGTG GGCTTGAA — — — — ^A AAGATCCTGA 

TGCAGTGGAG GATTAGAT C GAGACATATT 

GCCGTCGGCT TGGTCACC TCG GCTGAGATCG CCAAGGAGAC CGTCCAGGAfi AAGGAGGATG GCAGAGTCAC 

GTCACCGGCG GCTTCCGC — — ACTC GTCAGGGCAT 

GTCACCGGCG GCTTCCGC — ACTC GTCAGGGCAT 

GCGGTCGGCA ACATCAAC — ACGG CTGACATTGC 

GCGGTCGGCA ACATCAAC— ACGG CTGACATTGC 

GCTGTCGGAA TGATCAGG— GATG GTCTTACGGC 



AGTGT ACTTGTTTCA GCAGTAGGTG GAATCAAG- 



-A CTGGZVCATCT 



^ATGGT GGTCTACACC ACCGGCGGCT TCAAGACG — 

^AAGAT GTTGATCAGC ACTGTTGGTA GCATCAAG — 

-ATCGAACC CG2VCGCGTCC AAACGCAT6C TCGTCGGGG- 



GTGGGCG CCATGGTCGA 

^ATAG GTACCCTTGC 

CCGTG6 GAAT6ATGGA 



1941 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84 



GAATCAG 

GAATCAG 

CAATTCC 

CAATTCC 

CGAGGCAATG 

6AACAAG 

ATTGCTCAAC 
TAAACTCGAT 
CATCCAGCGC 

GGAGGCC 

GGAGGCC 

GCGCGATGTC 
GCGCGATGTC 
GAATGAAATC 



ATTCTAG 

^ATTCTAG 

TTGTTGG 

TTGTTGG 

CTGTCGGGAC 

CTGCTTG 

AAATATTTAG 
GAGTTTATTG 
GAGAACGGCG 

GCTTTGG 

GCTTTGG 

GTGGATGAGC 
GTGGATGAGC 
CTAGAAAGTG 



AGAAGG2\C " 

AGGAGGAG ~ 

CTAATGGT 

CCAAGACT -* 

AATCCGAT ^ 

AATCCGAT-""" — — — — — — — — — —————————— —————————— —————————— ______ 

AGGGCGCCGA GAAGGTGGCC GAGGCCAAGC AGACGCATGA CACCATCGAG GTCGTGAGCG AATCACRTGG CGGCAiVGACC 
AGGGCGCCGA GJIAGGTGGCC GAGGCCAAGC AGACGCATGA CACCATCGAG GTCGTGAGCG AATCACATGG CGGCAAGACC 
GAAAAGCT 



TGCTGAA 



-GAGGTTT TGCAATCT 



CGCGCTGCAG GGCGTCGATG GG 

GGAGGAG ATCATCG CTGGAGGAGA GGACGATACC 

AGGTTCC TACGATT CGCCCAAC 



2001 



2011 



2021 



2031 



2061 



2081 



GATATCGACG 
GATATCGACG 
GGACTGGACC 
GGACTGGACC 
AAGGCGGATG 
GGATTGGATG 
ACATTTGATC 
GACTTTGATA 
CGTGCCGATA 
GATTGCGACA 
GATTGCGACA 
AAGGCGGATC 
AAGGCGGATG 
GATG 



TTGCGCTGGT 
TTGCGCTGGT 
TTGTGCTGGT 
TTGTGCTGGT 
CCATTCTGAT 
TTGCGCTTGT 
TTGCTTTGAT 
TAGCATTGAT 
TGGTCCTTGT 
TGATCGGTAT 
TGATCGGTAT 
TGGTCCTCAT 
TGGTCCTCAT 
TTACTTTTGT 



TGGCCGTGGG 
TGGCCGTGGG 
TGGACGTGGC 
TGGACGTGGC 
AGCCCGTCAG 
GGGACGTGGT 
CGGTAGAGGA 
AGGTAAAGGA 
TGCCAGGCAG 
CG6ACGCCCG 
CGGACGCCCG 
TGCTCGCCAG 
TGCTCGCCM 
C6CAAGGGAG 



TTCCAGAAGG 
TTCCAGAAGG 
TTCCAGAAGA 
TTCCAGAAGA 
TTCCTGCGCG 
TTCCAGAAGG 
TTTTTAAGAA 
TTTCTCAA2\A 
TTCTTGAAGG 
GCCATCATCA 
GCCATCATCA 
TTCCTGCGCG 
TTCCTGCGCG 
TTCTTAAGGA 



ATCCCGGTCT 
ATCCCGGTCT 
ACCCGGGGCT 
ACCCGGGGCT 
AGCCAGAATG 
ATCCCGGTCT 
ATCCAGGTTT 
ACACTGGATT 
AfiCCCGAGTT 
ACCCTTCGCT 
ACCCTTCGCT 
AGCCTGAGTT 
AGCCTGAGTT 
ACCCGTCGTT 



GGCCTGGACG 
GGCCTGGACG 
GGTGTGGGCG 
GGTGTGGGCG 
GGTGTTTTCC 
GGCGTGGACT 
GGTATGGGAG 
GATCAGCCGT 
CGTCCTCACT 
TCCCGCCAAC 
TCCCGCCAAC 
TGTGCTGAGG 
TGTGCTGAGG 
GGT6CTAGAC 



TTTGCTCAGC 
TTTGCTCAGC 
TGGGCCGACG 
TGGGCCGACG 
ACGGCGAGAA 
TTCGCGCAGC 
TTTGCCGATA 
ATTGCTGACC 
GTCGCCGACG 
TTGATCCTCA 
TTGATCCTCA 
ACGGCGCATA 
ACGGCGCATA 
AGCGCGAACC 



ACCTCGGCGT C— 
ACCTCGGCGT C— 

AGCTGAATGT A— 
AGCTGAATGT A — 
AGTTGGGCGT G— 
ATCTTGATGT 
AACTTGGTGT 
AATTGCAAGC 
AGTTGGGTGT 
ACCCGGAGGT 
ACCCGGAGGT 
ACCTTGGGGT 
ACCTTGGGGT 



AGTTGGGTGA A- 



GGTATCGACA TTGTGAGGGC TGGACGTTGG TTCCAACAGA ATCCTGGTCT GGTTCGAGCT TTTGCTAACG AGCTTGGCGT G- 



ATAGGCAT CGGGCGCGCA GCCGGTTCGG AGCCGGACCT CGCCAAGGAC ATCATCGCGG GCAAGGTGTC CAGCATTATC AAATACGCCA 

CCCTTGGATC TTGTGGCTTC AGGCCGTCTG TTCCAGAAGA ACACTGGACT T6TTTGGTCA TGGGCTGACG ATCT(SAACAC T 

GGCCAAGACC GCAGCCAGAT TGGCAAGTTG GCCGAGCAGT CGATTCAGAG CGGAGAGTGT GATGCGGTAC TGTTGGCACG T GGATTGA 



2101 2111 2121 2131 2141 2151 2161 2171 2181 2191

_ -k * It -k ****** * » „ 

SEQ 1 GflflA TCTCCATGGC CaACCASATC CGCTGGGGCT TCRCCCGGCG TGGAGGCACC CCGTACATTG ATCCTTCGGT 

SEQ 2 GA2UI TCTCCATGGC CAACCAGATC CGCTGGGGCT TCACCCGGCG TGGAGGCACC CCGTACATTG ATCCTTCGGT 

SEQ 4 GAGA TCTCCATGGC TAATCAEATC CGATGGGGTT TCTCGCGGCG CGGTGCTGGT CCTTACCTCA GGIUUSAAACT 

SEQ 5 GAGA TCTCCATGGC TAATCASATC CGATGGGGTT TCTCGCGGCG CGGTGCTGGT CCTTACCTCA GGAAGAAACT

SEQ 7 CCGG TGACTGTCCC GGTGCAGTTT GGCAGGGCCA TTTAG 

SEQ 9 GAGA TTGCGATGGC GAGTCAGATT CGGTGGGGAT TCACAAGGCG CGGGGGCACG CCTTATATCG ACCCCAAAGC 

SEQ 11 AGAC TCCACCAGGC CTTGCAGTTA GGTTGGGGTT TCTGGCCCAA CAAACAACAA ATTGTTGATT TGATTGAAAG 

SEQ 13 CAAT TCAGAACAGC ACCTCAATAT AAGTTGGCCT TATCATAA 

SEQ 15 GATG TCAAGGCCCC TGTTCAGTAC CTCCGTGGTC CTCTTAGCAG CAGGCCCAAG AAGTTGACCA CTGTTCCTTA

SEQ 17 CCGG ATGCGGATGC CCGCTTGTTC GACAAGAAGA GGGCTGAGCC GCACTGGATC GTTGAGAAGT TGGGCATGAA 

SEQ 18 CCGG ATGCGGATGC CCGCTTGTTC GACAAGAAGA GGGCTGAGCC GCACTGGATC GTTGAGAAGT TGGGCATGAA

SEQ 20 AATG TGCAGTGGCC TCACCAATAC CACAGAGCAG TGTGGCGC2UI GGGTGCAAGG ATTTGA 

SEQ 21 AATG TGCAGTGGCC TCACCAATAC CACAGAGCAG TGTGGCGCAA GGGTGCAAGG ATTTGA 

SEQ 23 ^AATG TTGCATGGCC AGTTCAGTAT GACTATGCAG TTAAGGGACA CAGAAAGTTA CGTTGA — 

SEQ 28 

SEQ 29 

SEQ 32 GAGG TCftAGATGGC GftACC2\GATT GATTGGAGCT TCAAGGGACG TGGAAAGAAA GTGAACAAGA GTTCTTTATA 

SEQ 36 — — — —

SEQ 37 

SEQ 39 

SEQ 41 ■ 

SEQ 43 TGGGGGAGGA CGAGTTTGTG CTGCAGTTGA CTGCCTGCTC GGCGC2UUV.TA AGGCTGATGG CCAAGGGCGA GGAGCCGTTT GAC 

SEQ 82 TCTA TCCAGATCGC TCATCAGATC GCATGGGGTT TCG6TG6CAG AGCTAAGAAfi AACGCTCCCA AGCTTGTCTT 

SEQ 84 TGTCCTACCC AAGCTGGACC GAGGATGCTA GTGTAGCGCT GATGGGTJVCC AGGGCAGCTG GCAACCCGCA GTACCATCGC GTTCflCGTGG CTAAGAAGTG 

2201 2211 2221 2231 2241 2251 2261 2271 2281 2291



SEQ 1 GTACAAGCAfi TCTATTT.TCG ATGTATAfi— 

SEQ 2 GTACAAGCAG TCTATTTTCG ATGTATAG— 

SEQ 4 CGAGAAGATA TAA 

SEQ 5 CGAGAAGATA TAA 

SEQ 7 

SEQ 9 TTATAAGG2\G AGCATCTTTG AGTAA 

SEQ 11 AACATCTAAA TTAGAAGTAA ATTAG 

SEQ 13 
SEQ 15



SEQ 17 GTCCATTGTT GGTGCTGGTG TTGAGGTGGT ACGTCACGTT CCAACCCCAT TTGCTTCATT GTGTTTCC6A GTATGTCATG CT6ACTTGGT TCTTTTCTAG 

SEQ 18 GTCCATTGTT GGTGCTGGTG TTGAGGTG 

SEQ 20 — — 

SEQ 21 

SEQ 25 

SEQ 34 

SEQ 36 

SEQ 37 

SEQ 39 

SEQ 41 



SEQ 43 ^ATCTC AAACGCCGAC GAGGTGGCGC GGGTGACGCA GTTGATGGCG 

SEQ 82 A 

2301 2311 2321 2331 2341 2351 2361 2371 2381 2391 



SEQ 1 AGTATAGATA GAGTTGAAGA TGATACCTCA TAGACGATCA ATGGACCCTT GCATATTATT TCTCGTCTCC TGCGTATGTT CAAGGTATTC ACAGTAGCTG
SEQ 2 AGTATAGATA GAGTTGAAGA TGATACCTCA TAGACGATCA ATGGACCCTT GCATATTATT T 

SEQ 5 

SEQ 7 

SEQ 9 

SEQ 11 
SEQ 13 
SEQ 15 



SEQ 17 ACGTGGTATG TGAGCGAGCT CAAGAAGCTG GCCAAGTTTT AG- 

SEQ 18 ACGTGGTATG TGAGCGAGCT CAAGAAGCTG GCCAAGTTTT AG- 

SEQ 20 

SEQ 21 

SEQ 23 

SEQ 25 

SEQ 26 

SEQ 28 

SEQ 29 

SEQ 32 

SEQ 34 

SEQ 36 

SEQ 37 

SEQ 39 

SEQ 41 

SEQ 43 GAGGGCAAGG TG 

SEQ 82 

SEQ 84 



2401 2411 2421 2431 2441 2451 2461 2471 2481 2491 

SEQ 1 CGTCCTCTTA AGTTTCTCCG TCATTCGTTC TATTCTiVCTC CAATCGCARC GCATGGCGAC CACGGATCGA GTCGZiATTTC TCCGTCGTTC CmTCTGATC 

SEQ 2 ' " ~ IZ~IIIII IIIIIIIIII IIIIIZIIII IIIIIIIIII 

SEQ 7 ~ 

SEQ 9 IIIIIIIII IIIIIIIIII IIIIIIIIII IIIIIIIIII 

SEQ 13 ~ 2" ~~~~ IIIIIIIIII 

SEQ 20 ZZZZ IIIIIIIIII 

SEQ 23 " ~ 2Z 

SEQ 32 

SEQ 34 " 

SEQ 36 

SEQ 37 



2501 2511 2521 2531 2541 2551 2561 2571 2581 



SEQ 1 AATATAIU^ GCGGGGAATG GCTTGACCCC GCGCRGRRTG TCGMCTCTT CGCAZWW:TCT CGGTGTMAG GACGCTCaGC AftCGATCAZWS G 

SEQ 4 " 

SEQ 5 

SEQ 7 

SEQ 9 I'lIIIII I 

SEQ 13 

SEQ 15 • 

SEQ 17 

SEQ 18 

SEQ 20 2 IIIIIIIIII IIIIIIIIII I 

SEQ 23 ~ IIIIIII IIIIIIIIII I 

SEQ 26 mil IIIIIIIIII I 

SEQ 34 mil I 

SEQ 37 

SEQ 39 

SEQ 41 * : 

SEQ 43 " IIIIIIIIII I 

SEQ 84 



Figure 2. A multiple alignment of the 2031 OR nucleic acid 
sequence from A. fumigatus (SEQ 1, 2) along with related 2031 
ORs from other fungi and bacteria (see also Example 4). 
Regions 1-11, marked with * or [illegible], refer to regions 
conserved at the amino acid level between ORs but not OYEs. 

Fungal 2031 ORs are given by the following SEQ ID Nos.: SEQ ID 
Nos. 1, 2, 4, 5 and 7, A. fumigatus; SEQ ID No. 9, A. nidulans; 
SEQ ID Nos. 11 and 13, C. albicans; SEQ ID Nos. 15, 17 and 18, 
N. crassa; SEQ ID Nos. 20, 21 and 43, M. grisea; SEQ ID No. 23 
(NP_595868), S. pombe; SEQ ID Nos. 25 and 26, C. trifolii; 
SEQ ID Nos. 28, 29, 31, 32 and 34, F. sporotrichioides; SEQ 
ID Nos. 36, 37 and 82, F. graminearum; SEQ ID Nos. 39 and 
41, M. graminicola; SEQ ID No. 84, U. maydis. 



Panel A lanes: 0 h, 0.5 h, 1 h, 2 h, 4 h and 24 h, each shown without (-) and with (+) IPTG. 

Panel B. 




Figure 3. Recombinant 2031 OR. (A) Time course of recombinant 2031 OR induction 
over 24 hours after the addition of IPTG (samples without IPTG are also shown). The gel 
was stained with Coomassie. A prominent band of the correct molecular weight (marked 
with an arrow) is seen. (B) Coomassie-stained gel showing purified recombinant 2031 OR. 



Tree leaf labels, in the order shown (node values on the tree range from 55 to 100): 

Fungal: 4961, A. fumigatus; SEQ ID No. 43, M. grisea; SEQ ID No. 19, N. crassa; 
SEQ ID No. 14, C. albicans; SEQ ID No. 12, C. albicans; SEQ ID No. 24, S. pombe; 
SEQ ID Nos. 30 + 33, F. sporotrichioides; SEQ ID No. 6, A. fumigatus; 
SEQ ID No. 3, A. fumigatus; SEQ ID No. 10, A. nidulans; SEQ ID No. 8, A. fumigatus; 
SEQ ID No. 16, N. crassa; SEQ ID No. 22, M. grisea 

Bacterial: NP_295913; NP_625402; AF320254; T44612 

Fungal OYEs: 6-2460, C. albicans; A36990, C. albicans; NCU04452.1, N. crassa; 
4875, A. fumigatus; OYE3, S. cerevisiae; OYE2, S. cerevisiae; OYE1, S. cerevisiae 

Figure 4. Phylogenetic tree showing relationships between A. fumigatus 2031 OR and 
similar proteins. This demonstrates a 2031 OR clade, which can be distinguished from 
the OYE proteins. 




Figure 5: NADPH dehydrogenase activity of recombinant 2031 OR with cyclohexenone 
(CHX), N-ethylmaleimide (NEM), menadione (MEN) or duroquinone (DQ) as substrates. 
Final concentrations in the assay were as follows: 500 µM substrate, 120 µM NADPH, 
1 µg/200 µL 2031 OR. 



